1. INTRODUCTION 

You have implemented a class [Dahl and Nygaard 1966; Arnold and Gosling 1998], 
FIFO, whose instances are FIFO queues with public methods enqueue and dequeue 
as well as method size that reports the number of elements in the queue. The class, 
implemented in some Java-like object-oriented language, is part of a library and 
is used by many programs, most unknown to you. The queue is represented using 
a singly linked chain of nodes that point to elements of the queue. There is also 
a sentinel node [Cormen et al. 1990]. Each instance of FIFO has a field num with 
the number of nodes and a field snt that references the sentinel. You realize that a 
simpler, more efficient implementation can be provided without the sentinel, using 
two fields, head and tail, pointing to the end nodes in the chain. You revise method 
size to return num instead of num - 1 and revise the other methods suitably. You are 
guided to the necessary revisions by thinking about the correspondence, sometimes 
called a simulation relation, between the representations for the two versions. 
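
To make the scenario concrete, here is a minimal Java sketch of the two representations; it is an illustration, not the code of any actual library. The class names FifoWithSentinel, FifoHeadTail, and Node are assumptions introduced here, as is the choice to walk the chain in the sentinel version's enqueue. Only enqueue, dequeue, size, num, snt, head, and tail come from the description above.

```java
// Hypothetical rendering of the two FIFO versions; class names are assumed.
final class Node {
    Object elem;   // element of the queue (unused in the sentinel)
    Node next;     // next node in the chain, or null at the end
}

// Version 1: chain with a sentinel; num counts all nodes, sentinel included.
class FifoWithSentinel {
    private final Node snt = new Node();
    private int num = 1;

    void enqueue(Object x) {
        Node n = new Node();
        n.elem = x;
        Node p = snt;
        while (p.next != null) p = p.next;   // walk to the last node
        p.next = n;
        num++;
    }

    Object dequeue() {
        Node first = snt.next;
        if (first == null) throw new IllegalStateException("empty queue");
        snt.next = first.next;
        num--;
        return first.elem;
    }

    int size() { return num - 1; }           // do not count the sentinel
}

// Version 2: no sentinel; head and tail point to the end nodes of the chain.
class FifoHeadTail {
    private Node head, tail;
    private int num = 0;

    void enqueue(Object x) {
        Node n = new Node();
        n.elem = x;
        if (tail == null) head = n; else tail.next = n;
        tail = n;
        num++;
    }

    Object dequeue() {
        if (head == null) throw new IllegalStateException("empty queue");
        Node first = head;
        head = first.next;
        if (head == null) tail = null;
        num--;
        return first.elem;
    }

    int size() { return num; }
}
```

In this sketch the correspondence between the versions is that num in the first equals num + 1 in the second, and the chain after the sentinel holds the same elements as the chain starting at head.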

Can the revisions affect the behavior of clients, that is, programs that use class 
FIFO in some way or other? The answer would be yes, if some client determined 
the number of nodes by reading field num directly. A client that refers to field 
name snt would no longer compile. But you have taken care to encapsulate the 
queue's representation: the fields are declared to be private. By using programming 
language constructs like private fields you aim to ensure that client programs depend 
only on the abstraction provided by the class, not on its representation. If client 
behavior is independent from the representation of FIFO, it is enough for you to 
ensure equivalent visible behavior of the revised methods. 

For scalable systems, scalable system-building tools, and scalable development 
methods, abstraction is essential. For reasoning about a single component, e.g., 
a class, module, or local block, abstraction makes it possible to consider other 
components in terms of their behavioral interface rather than their internal 
representation.^1 Abstraction is needed for the automated reasoning embodied in static 
analysis tools [Cousot and Cousot 1977] and it is needed for formal and informal 
reasoning about functional correctness during development and evolution [Milner 
1971; Hoare 1972]. Modular reasoning has always been a central issue in software 
engineering and in static analysis. With the ascendancy of mobile code it has be- 
come absolutely essential. For example, it is possible for clients of FIFO to be linked 
to it only at runtime, so it is impossible to check all uses to determine whether the 
revisions affect them. 

The need for flexible but robust encapsulation mechanisms to support data ab- 
straction has been one of the driving forces in the evolution of programming lan- 
guage design, from type safety and scoped local variables to module and abstract 
data type constructs [Liskov and Guttag 1986]. There is a rich theoretical liter- 
ature on the subject (e.g., [Plotkin 1973; Reynolds 1974; Donahue 1979; Haynes 
1984; Reynolds 1984; He et al. 1986; Mitchell 1996; Lynch and Vaandrager 1995; 
de Roever and Engelhardt 1998]). Many different language constructs have been 
studied. There is considerable variation in the details of these theories, partly 
because the intended applications vary from justifying general tools for program 
analysis and transformation to justifying proof rules to be applied to specific 
programs as in the FIFO example. The common thread is that two implementations of 
a component are linked by a simulation relation between the two representations. 

^1 Even a primitive type like int is an abstraction from the machine representation. 

Fig. 1. A FIFO object with its encapsulated representation: private fields and 
nodes of a list (within the dashed rectangle). One element of the queue is shown 
as well as a user of the queue, but other objects and references are omitted. The 
dotted reference is an example of representation exposure. 

Unfortunately, these theories are inadequate for object-oriented programs. They 
deal well with the encapsulation of data structures that correspond directly to some 
language construct, such as modules, local variables, or private fields. But the FIFO 
example also involves encapsulation of a data structure composed of heap cells and 
pointers, including aliasing with the tail field as depicted in Fig. 1. 

The problem is that encapsulation provided by language constructs often runs 
afoul of aliasing. For variables and parameters, aliasing can be prevented through 
syntactic restrictions that are tolerable in practice (and often assumed in formal 
logics and theories). Aliasing via pointers is an unavoidable problem in 
object-oriented programming where shared mutable objects are pervasive. Yet unintended 
aliasing can be catastrophic. A version of the Java access control system was 
rendered insecure because a leaked reference to an internal data structure made it 
possible to forge cryptographic authentication [Vitek and Bokowski 2001]. In simply 
typed languages, types offer limited help: variables x, y are not aliased if they have 
different types. Even this help is undercut by subclass polymorphism: in Java, a 
variable x of type Object can alias y of any type. 
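
The last point can be seen in a few lines of Java. This is only an illustrative sketch; the class name AliasDemo and the use of StringBuilder as the mutable object are assumptions made here.

```java
class AliasDemo {
    // Returns true if a mutation made through x is visible through y.
    static boolean aliased() {
        StringBuilder y = new StringBuilder("a");
        Object x = y;                        // different static types, same object
        ((StringBuilder) x).append("b");     // mutate through x
        return y.toString().equals("ab");    // the change is visible through y
    }
}
```

The static types Object and StringBuilder differ, yet both variables refer to one heap object, so the type system alone does not rule out the alias.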

The ubiquity and practical significance of the issue is articulated well in the man- 
ifesto of Hogg et al. [1992]. A number of subsequent papers in the object-oriented 
programming literature propose disciplines to control aliasing. Of particular rele- 
vance are disciplines that impose some form of ownership confinement that restricts 
access to designated "representation objects" except via their "owners", to prevent 
representation exposure [Leino and Nelson 2002]. A good survey on confinement, 
especially ownership, can be found in the dissertation of Clarke [2001]; see also Lea 
[2000], Vitek and Bokowski [2001], Clarke et al. [2001], Müller and Poetzsch-Heffter 
[2000], Boyland [2001], Aldrich et al. [2002], and the related work section of this 
paper. 

In Figure 1, an instance of class FIFO (the owner) uses private fields to point 
to objects intended to be part of its encapsulated representation, as indicated by 
the dashed rectangle. The contribution of this paper is a theory of representation 
independence for encapsulation of data in the heap, using ownership confinement. 
We follow Reynolds [1984] in calling our main result an abstraction theorem. Some 
readers may prefer the term relational parametricity. 

The literature on confinement is largely concerned with static or dynamic checks 
to ensure invariance of various confinement properties. One of our contributions is 
to show how established semantic techniques can be used to evaluate confinement 
disciplines. To prove our abstraction theorem, we use a semantic formulation of 
confinement. Separately, we give a modular, syntax-directed static analysis for con- 
finement and show that it accepts some interesting example programs that embody 
important object-oriented design patterns. 

There are a number of ways in which abstractions can be expressed using con- 
structs of contemporary object-oriented languages, including modules, classes, local 
variables, object instances, not to mention heap structures such as object groups. 
We treat the most common situation: an instance of some class is viewed as repre- 
senting an abstraction, possibly using some other objects as part of its representa- 
tion. 

We are aware of no previous results on representation independence that address 
encapsulation of objects in the heap. Thus it is tempting to present the ideas 
in the setting of a simple idealized language, say a simple imperative language 
with pointers to mutable heap cells. But this would leave open some challenging 
issues, such as how class-based scoping rules fit with instance-based abstraction. 
We have chosen to consider a rich imperative object-oriented language with class- 
based visibility, inheritance and dynamic binding, type casts and tests, recursive 
types, and other features sufficient for programs that fit common design patterns 
such as observer and factory [Gamma et al. 1995]. 

Previous work on representation independence has been concerned with relating 
two versions of a component with respect to programs that use the component. 
But the designer of a class needs to consider not only users (the client interface) 
but also subclasses (the protected interface). This is a source of complication in our 
treatment of confinement and, to a lesser extent, in our treatment of representation 
independence. Our results consider replacement of one version of a class by another 
with the same public interface, in the context of arbitrary classes that use it or are 
subclasses of it. 

Overview and roadmap. Sect. 2 introduces the language for which our results are 
proved and describes a simple example with which we review the formalization of 
representation independence using simulation relations. The example is extended 
to one showing how representation independence can be invalidated by leaked refer- 
ences to representation objects. The section concludes with an informal statement 
of our abstraction theorem. 

Sect. 3 discusses more elaborate examples that typify object-oriented programs. 
A version of a Meyer-Sieber [Meyer and Sieber 1988] example shows how higher 
order programs can be expressed. Versions of the observer pattern [Gamma et al. 
1995] illustrate challenges in formulating robust but practical notions of confine- 
ment. The section concludes with an informal description of our notion of ownership 
confinement. 

Sect. 4 formalizes the syntax and typing rules. Sect. 5 gives a surprisingly simple 
denotational semantics in the manner of Strachey [2000]. Confinement, the seman- 
tic notion, is defined formally in Sect. 6. Sect. 7 gives the first main result, an 
abstraction theorem for confined programs. Sect. 8 shows in detail how the the- 
orem applies to the examples in Sect. 3 and to further variations on the observer 
pattern. Sect. 9 considers examples of the interface between an owner class and its 
subclasses. To achieve a sufficiently flexible form of confinement for subclasses of 
the owner class, we add a simple module construct to the language. Sect. 10 proves 
a second abstraction theorem, for this extended language and for a generalized no- 
tion of simulation needed for owner subclasses. Sect. 11 wraps up the technical 
development by defining a static analysis for confinement that accepts the exam- 
ples of Sections 2, 3, 8, and 9; soundness with respect to (semantic) confinement is 
shown. Sect. 12 discusses related work and open challenges. 

Detailed proofs are given, as the complexity of similar languages has led to errors 
in published proofs, e.g., of type soundness. Appendices [to be put on line but not 
in print] give some additional proofs and executable Java code for all the examples. 

The organization of the paper is intended to make it possible for the casual 
reader to skip some technical material and still get the gist of the results. Readers 
who wish to study the details may still prefer to skip, on a first reading, material 
concerning object constructors and proofs that involve fixpoints and inheritance. 

Differences from the preliminary version. Outgoing references, from representa- 
tion objects to client objects, were disallowed in the preliminary version of this 
paper [Banerjee and Naumann 2002a]. We conjectured that they could be allowed 
if restricted to read-only access as in [Müller and Poetzsch-Heffter 2000; Leino and 
Nelson 2002]. Here we allow them without restriction, as is needed to handle exam- 
ples such as the observer pattern where observers may well change state in response 
to events. We have also added constructors to the language, at the cost of some 
complexity in proofs due to the interdependence of semantics for commands and 
for constructors. The benefit is succinct formulation of an abstraction theorem 
sufficient for transparent application to realistic examples. The other major addi- 
tions are as follows: module-scoped methods, the generalized abstraction theorem, 
substantial worked examples, and the static analysis for confinement. 

In [Banerjee and Naumann 2002a] we discuss simulation proofs of the equivalence 
of "security passing style" [Wallach et al. 2000] with the lazy "stack inspection" 
implementation of Java's privilege-based access control mechanism [Gong 1999], 
and then extend our language to include access control. We give an abstraction 
theorem for this extended language. It was this study that led us to the main 
results but in retrospect it seems tangential and is omitted. 

2. REPRESENTATION INDEPENDENCE 

We begin this expository section with a very simple example of representation in- 
dependence, contrived mainly to introduce the Java-like language that we will use. 
Building on this example we show how pointer aliasing can invalidate representation 
independence. We conclude with an informal statement of the main results. Sect. 3 
deals with more challenging examples including the observer pattern [Gamma et al. 
1995] and gives a more precise description of ownership confinement. 

2.1 A first example 

The concrete syntax for classes is based on that of Java [Arnold and Gosling 1998] 
but using more conventional notation for simple imperative constructs. Keywords 
are typeset in bold font and comments are preceded by double slash. A program 
consists of a collection of class declarations like the following one. 

class Bool extends Object { 
  bool f;                                 // private field 
  con{ skip }                             // public constructor 
  unit set(bool x){ self.f := x }         // public method 
  bool get(){ result := self.f }          // public method 
} 

There are two associated methods: set takes a boolean parameter and returns 
nothing; get takes no parameter and returns a boolean value. Methods are considered 
to be public, that is, visible to methods in all classes. (Module-scoped methods are 
added in Sect. 10.) Every method has a return type; the primitive type unit, with 
only a single value (it), corresponds to Java's "void" and is used for methods like 
set that are called only for their effect on state. 

Instances of class Bool have a field f of (primitive) type bool. A field f is accessed 
in an expression of the form e.f, and in particular self.f is used for fields of the 
current object; a bare identifier like x is either a parameter or a local variable. 
The distinguished variable result provides the return value; it is initialized with the 
default for its type (false for bool and nil for class types). Fields are considered 
to be private, that is, visible only to the methods declared in the class. Visibility 
is class-based, as in many mainstream object-oriented languages: an object can 
directly access the private fields of another object of the same class. 

When a new object is constructed, each field is initialized with the default value 
for its type. Then the constructor commands are executed: the constructors declared 
in superclasses are executed before the declared one, which is designated 
by keyword con. We refrain from considering constructors with parameters. In 
subsequent examples we omit the constructor if it is skip. 

The observable behavior of a Bool object can be achieved using an alternate 
implementation in which the complement is stored in a field: 

class Bool extends Object { 
  bool f; 
  con{ self.f := true } 
  unit set(bool x){ self.f := ¬x } 
  bool get(){ result := ¬(self.f) } 
} 

We do not formalize class types ("interfaces" in Java) separately from class 
declarations. Class names are used as types and we use the term class loosely to mean the 
name of a declared class. But we are concerned with relating comparable versions 
of a class: as in the example above, a comparable version has the same name and 
methods with the same names and signatures. 
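
For readers who want to run the two versions side by side, the following Java sketch may help. It departs from the setting above in one respect: the two versions are given distinct names (BoolDirect and BoolComplement, both hypothetical) and a shared interface BoolApi, also an assumption, so that they can coexist in one program.

```java
// Assumed common interface, introduced only so both versions fit in one file.
interface BoolApi {
    void set(boolean x);
    boolean get();
}

// First version: the field holds the value itself.
class BoolDirect implements BoolApi {
    private boolean f;                       // default value is false
    public void set(boolean x) { f = x; }
    public boolean get() { return f; }
}

// Second version: the field holds the complement of the value.
class BoolComplement implements BoolApi {
    private boolean f = true;                // corresponds to con{ self.f := true }
    public void set(boolean x) { f = !x; }
    public boolean get() { return !f; }
}
```

A caller that only uses set and get sees the same results from either class, starting with get() returning false on a fresh object.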

We claim that no client program using Bool can distinguish one implementation 
from the other; thus we are free to replace one by the other. Of course this is not 
the case if we consider aspects of client behavior such as real time or the size of 
object code — but these are not at the level of abstraction of source code. Moreover, 
input and output for end users is of some limited type like int or String. If a Bool 
could be output directly, say displayed in binary on the screen, then an end user 
could distinguish between the implementations. So we consider only clients that 
use Bool objects in temporary data structures and not as input or output data. 
An example of such a client is method main in the following class. It declares 
a local variable b of type Bool, with scope beginning at the keyword in. In the 
absence of explicit braces, the scope of a local variable extends to the end of the 
method body. 

class Main extends Object { 
String inout; 

unit main(){ Bool b := new Bool in 

if . . .self.inout. . .then b.set(true) else b.set(false) fi; 
self.inout := convertToString(b.get()) } } 

We may consider method main as a main program for which the observable state 
consists of field inout. Its final value depends on some condition ". . .self.inout. . ." 
on its initial value. No object of type Bool is reachable in the state of a Main object 
after invocation of main, so there is no observable difference between its behavior 
using one implementation of Bool and its behavior using the other. 

The claim is that we need not consider specific clients; there is no use of Bool 
that can distinguish between the two implementations. The standard reasoning 
goes as follows. 

(1) Suppose o is an object of type Bool for the first implementation and o' an object 
for the second. The correspondence between their states is described by the 
basic coupling relation 

    o.f = ¬(o'.f). 

(2) This relation has the simulation property: 

— it holds initially (once the constructor has been executed), and 

— if the two versions of set (respectively, get) are executed from related states 
then the outcomes are related. (As we consider sequential programs, the 
outcome is the updated heap and the return value if any.) 

In short, the relation is established by the constructor and preserved by the 
methods of Bool. 

(3) To consider client programs we must consider program states consisting of 
local variables (and parameters) along with the heap, which may contain many 
instances of Bool as well as other objects. For states, we define the induced 
coupling relation. Primitive values and locations are related by equality.^2 A 
pair of heaps are related if there is a one-to-one correspondence between Bool 
objects such that they are pairwise related by the basic coupling of (1), and 
everything else is related by equality. 

The induced coupling relation is preserved by all commands in methods of all 
classes. This is the abstraction theorem. 

(4) For a pair of states related by the (induced) coupling, if no Bool objects are 
reachable then the states are equal. This fact, known as the identity extension 
lemma, holds by definition of the induced coupling. 
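
Steps (1) and (2) can be checked mechanically for this small example. The Java sketch below is an illustration under stated assumptions: the names SimulationCheck, Bool1, Bool2, and coupled are hypothetical, and the field f is left package-visible only so that the check can read it. It enumerates every related pair of states and every argument, which is feasible here because the state is a single boolean.

```java
class SimulationCheck {
    static final class Bool1 {               // first implementation
        boolean f;
        void set(boolean x) { f = x; }
        boolean get() { return f; }
    }

    static final class Bool2 {               // second implementation
        boolean f = true;
        void set(boolean x) { f = !x; }
        boolean get() { return !f; }
    }

    // Basic coupling of step (1): o.f = not(o'.f).
    static boolean coupled(Bool1 o, Bool2 p) { return o.f == !p.f; }

    // Step (2): the coupling holds initially and is preserved by set and get.
    static boolean check() {
        if (!coupled(new Bool1(), new Bool2())) return false;
        for (boolean init : new boolean[] {false, true}) {
            for (boolean arg : new boolean[] {false, true}) {
                Bool1 o = new Bool1();
                Bool2 p = new Bool2();
                o.f = init;
                p.f = !init;                  // a related pair of states
                o.set(arg);
                p.set(arg);
                if (!coupled(o, p)) return false;
                if (o.get() != p.get()) return false;   // equal return values
                if (!coupled(o, p)) return false;       // get preserves it too
            }
        }
        return true;
    }
}
```

The check covers only the methods of Bool; the abstraction theorem of step (3) is what extends preservation to all client commands.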

It is a consequence of (3) and (4) that the two implementations cannot be 
distinguished by a client that does not input or output Bool objects. Any initial state 
for such a client is related to itself, by (4). We can consider an execution of the 
client using either of the two implementations of Bool; the final states are related, 
according to (3). And thus they are equal, by (4). 

^2 Later we refine this point. 

Identity extension confirms that the chosen notion of coupling relation is suited 
to the chosen form of encapsulation. (Here, encapsulation means private fields and 
objects not input or output.) It is typically a straightforward consequence of the 
definitions. 

For program refinement, identity can be replaced by inequality in step (4). In 
this paper we do not emphasize refinement, but the requisite adaptation of our 
results is straightforward. For applications in program analysis, other relations are 
used in step (4), e.g., for secure information flow the relation expresses equivalence 
from the point of view of low-security observers [Volpano et al. 1996].^3 

The abstraction theorem is a non-trivial property of the language. It would fail, 
for example, if the language had constructs that allowed client programs to read 
the private fields of Bool — or to enumerate the names of the private fields, or to 
query the number of boolean fields that are currently true. Such operations would 
be considered strange indeed. 

Familiar operations on pointers, however, can also violate abstraction. For exam- 
ple, with pointer arithmetic one can distinguish between two representations that 
differ only in the size of storage used (e.g., representing a boolean value using one 
bit of an integer versus one bit of a character). Even in the absence of pointer 
arithmetic, shared references lead to the following problem. 

2.2 Representation exposure 

Consider the following class OBool which provides functionality similar to that of 
Bool, in fact using Bool. For clarity we have chosen different method names, to 
emphasize that we are not comparing this class with Bool. 

class OBool extends Object { 
Bool g; 

unit init(){ self.g := new Bool; self.g.set(true) } 
unit setg(bool x){ self.g.set(x) } 
bool getg(){ result := self.g.get() } } 

To simplify the formal development, we sidestep the complicated interactions be- 
tween subclassing and method calls in constructors by confining attention to con- 
structors without parameters or method calls. In cases where this is inadequate, 
an ordinary method can be used (like init in this example). 

Here is an alternate implementation of OBool. 

class OBool extends Object { 
Bool g; 

unit init(){ self.g := new Bool; self.g.set(false) } 
unit setg(bool x){ self.g.set(¬x) } 
bool getg(){ result := ¬(self.g.get()) } } 

^3 Our formulation of the abstraction theorem can be applied directly to prove command and class 
equivalences for a specific program. For applications of simulation in static analysis, the problem 
is usually to show that a syntax directed system of types and effects approximates some property 
like secure information flow, for all programs in a language. We have not attempted to formulate 
an abstraction theorem general enough to apply directly in such analyses; they use analysis-specific 
typing systems rather than the language's own types and syntax. But the essence of our result 
is that the language is relationally parametric, given suitable confinement conditions. Indeed, 
in work subsequent to this paper, Banerjee and Naumann [2002b] use the same language and 
semantic model for a relational analysis of secure information flow. 

To describe the connection between the two implementations a suitable basic cou- 
pling (recall (1) in Sect. 2.1) is the following relation between an object state o for 
the first implementation of OBool and o' for the alternate one: 

    (o.g = nil = o'.g) ∨ (o.g ≠ nil ≠ o'.g ∧ o.g.f = ¬(o'.g.f))    (*) 

If o and o' are newly constructed, the first disjunct holds; method init establishes 
the second disjunct. Invocations of setg and getg maintain the relation: From 
related initial states, either both abort (due to dereferencing nil because init has 
not been called) or both terminate in related states. 
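
The coupling (*) and its preservation can be replayed in Java. The sketch below is illustrative only: the names CouplingStar, OBool1, OBool2, and star are hypothetical, fields are package-visible so that the relation can inspect them, and Java's null and NullPointerException stand in for nil and abortion.

```java
class CouplingStar {
    static final class Bool {
        boolean f;
        void set(boolean x) { f = x; }
        boolean get() { return f; }
    }

    static final class OBool1 {              // first implementation
        Bool g;
        void init() { g = new Bool(); g.set(true); }
        void setg(boolean x) { g.set(x); }
        boolean getg() { return g.get(); }
    }

    static final class OBool2 {              // alternate implementation
        Bool g;
        void init() { g = new Bool(); g.set(false); }
        void setg(boolean x) { g.set(!x); }
        boolean getg() { return !g.get(); }
    }

    // The relation (*): both reps absent, or both present with complementary f.
    static boolean star(OBool1 o, OBool2 p) {
        return (o.g == null && p.g == null)
            || (o.g != null && p.g != null && o.g.f == !p.g.f);
    }

    static boolean check() {
        OBool1 o = new OBool1();
        OBool2 p = new OBool2();
        if (!star(o, p)) return false;        // first disjunct, before init
        o.init();
        p.init();
        if (!star(o, p)) return false;        // second disjunct, after init
        for (boolean x : new boolean[] {true, false}) {
            o.setg(x);
            p.setg(x);
            if (!star(o, p) || o.getg() != p.getg()) return false;
        }
        return true;
    }
}
```

Calling setg or getg before init dereferences null in both versions, which mirrors the case where both executions abort.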

For these implementations, it is not just a private field that is to be encapsulated, 
but also the object referenced by that field. This is apparent in the coupling (*) 
which involves both. To describe the roles of the objects involved, we call class 
OBool an owner class. Its instances "own" objects of class Bool, their representation 
objects, which are called reps for short. Together, an owner and its reps constitute 
what we call an island (cf. Fig. 1), following Hogg [1991]. 

Here is a suitable client for OBool. 

class Main extends Object { 
String inout; 

unit main(){ OBool z := new OBool in z.init(); 

if . . .self.inout. . .then z.setg(true) else z.setg(false) fi; 
self.inout := convertToString(z.getg()) } } 

This does not distinguish between the two implementations of OBool nor does it 
violate the intended encapsulation boundary. 

Suppose we add to both versions of OBool the following method which "leaks" a 
reference to the rep object. 

Bool bad(){ result := self.g } 

The method gives its caller an alias to the object pointed to by the private field 
g. This makes the location of the encapsulated object visible to clients. In and of 
itself, access to this location is not harmful.^4 Like the other methods, method bad 
preserves (*). But a client class C can exploit the leak as in the following command. 

OBool z := new OBool in z.init(); 

Bool w := z.bad() in if w.get() then skip else abort fi 



^4 To make this clear, one could assume that, for both versions of OBool, the Bool object is allocated 
at the same location. The assumption can be formalized by adding a conjunct o.g = o'.g to 
coupling (*) and assuming that method init preserves this equality. It is then preserved by all 
the methods of OBool including bad. Another justification is given in Sect. 10 where we show 
formally how the language is "parametric in locations". 
The command aborts if the new OBool is an object o' for the second implementation 
of OBool, but it does not abort for an object o for the first implementation. The 
client command preserves the relation (*), indeed it does not alter the state of the 
objects it accesses. But the relation is not the identity for the rep object states: we 
have o.g = o'.g but o.g.f is not equal to o'.g.f. So the relation is not the identity 
for the client to which the reps are visible. An attempt to argue using the steps in 
Sect. 2.1 breaks down because identity extension (4) fails. 

The abstraction theorem, step (3), can also fail. Consider the following client 
command. 

OBool z := new OBool in z.init(); Bool w := z.bad() in w.set(true) 

This does not preserve relation (*). To see why, suppose o, o' are a related pair 
of OBool objects assigned to z and satisfying (*). After the assignment to w, the 
effect of w.set(true) is to make o.g.f = o'.g.f, contrary to the relation (*). This is 
very different from the effect of z.setg(true). 
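
Both client commands can be tried out in Java. The following is a sketch under assumptions: RepExposureDemo, Owner, OBool1, OBool2, readThroughLeak, and writeThroughLeak are hypothetical names, a boolean result replaces skip and abort in the first client, and the final call to getg in the second client is an addition that makes the broken coupling observable.

```java
class RepExposureDemo {
    static final class Bool {
        boolean f;
        void set(boolean x) { f = x; }
        boolean get() { return f; }
    }

    // Common shape of both OBool versions, introduced only for this sketch.
    interface Owner {
        void init();
        boolean getg();
        Bool bad();
    }

    static final class OBool1 implements Owner {
        private Bool g;
        public void init() { g = new Bool(); g.set(true); }
        public boolean getg() { return g.get(); }
        public Bool bad() { return g; }       // leaks the rep
    }

    static final class OBool2 implements Owner {
        private Bool g;
        public void init() { g = new Bool(); g.set(false); }
        public boolean getg() { return !g.get(); }
        public Bool bad() { return g; }       // leaks the rep
    }

    // First client: true stands for "skip", false for "abort".
    static boolean readThroughLeak(Owner z) {
        z.init();
        Bool w = z.bad();
        return w.get();
    }

    // Second client: writes through the leaked reference, then calls getg.
    static boolean writeThroughLeak(Owner z) {
        z.init();
        Bool w = z.bad();
        w.set(true);
        return z.getg();
    }
}
```

With the first version both methods return true; with the second both return false, so each client separates the two implementations once the rep has been leaked.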

The examples show that both ingredients of representation independence — 
identity extension and preservation — can fail if a rep is leaked. The challenge is 
to confine pointers in a way that disallows harmful leaks and thus admits a robust 
representation independence property — without imposing impractical restrictions. 
The challenge is made more difficult by various features of Java-like languages, for 
example, type casts. We consider casts now; other challenges are deferred to Sect. 3. 

Suppose we change the return type for method bad, attempting to hide the type 
of the rep object. 

Object bad(){ result := self.g } 

Class Object is the root of the subclassing hierarchy so by subsumption it allows 
references to objects of any class. The client can use a (Bool) cast to assert that 
the result of z.bad() has type Bool. (In a state where the assertion is false, the cast 
would cause abortion.) 

OBool z := new OBool in z.init(); 

Bool w := (Bool)(z.bad()) in if w.get() then skip else abort fi 

Again, the client is dependent on representation. 
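
The same effect is easy to reproduce in Java, where a failed cast raises ClassCastException rather than aborting. This is a minimal sketch; CastDemo, Owner, OBool1, OBool2, and client are hypothetical names, and the boolean result again stands for skip versus abort.

```java
class CastDemo {
    static final class Bool {
        private boolean f;
        void set(boolean x) { f = x; }
        boolean get() { return f; }
    }

    // Assumed common interface; bad() now hides the rep type behind Object.
    interface Owner {
        void init();
        Object bad();
    }

    static final class OBool1 implements Owner {
        private Bool g;
        public void init() { g = new Bool(); g.set(true); }
        public Object bad() { return g; }
    }

    static final class OBool2 implements Owner {
        private Bool g;
        public void init() { g = new Bool(); g.set(false); }
        public Object bad() { return g; }
    }

    // The client recovers the rep type with a downcast and reads the rep.
    static boolean client(Owner z) {
        z.init();
        Bool w = (Bool) z.bad();   // ClassCastException if the result is not a Bool
        return w.get();
    }
}
```

Weakening the declared return type therefore does not restore encapsulation as long as the class name Bool is in scope for the client.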

Note that the cast could not be used if the scope of class name Bool did not 
include the client. This suggests a focus on modules ("packages" in Java) for 
confinement of pointers, as has been studied by Vitek and Bokowski [2001] among 
others (see Sect. 12). But in our example the field has private scope, each rep is 
associated with a single owner, and the coupling relation is expressed in terms of 
a single owner. Our results account for this sort of instance-based encapsulation, 
which is common in practice and which is similar to the value-oriented notions used 
for representation independence in functional languages [Reynolds 1984; Mitchell 
1986; 1991]. 

2.3 Overview of results 

In the examples above, class OBool is viewed as providing an abstraction. It is 
just as sensible to consider Bool as providing an abstraction for which OBool is a 
client. We do not annotate programs with a fixed designation of owners and reps. 
Rather, we study how to reason about a class, say Own, one has chosen to view as 
an abstraction with encapsulated representation. Objects of any subclass of Own 
are also considered to be owners. A second class, say Rep, is designated as the 
type of reps for Own. (In practice, Rep could be an interface or class type; this 
generalization is straightforward but would complicate the formalization.) 

A complete program is a closed collection of class declarations, called a class 
table. We consider an idealized Java-like language similar to the sequential 
fragment of C++ (without pointer arithmetic), Modula-3, Oberon, C#, Eiffel, and 
other class-based languages. It includes subclassing and dynamic dispatch, 
class-oriented visibility control, recursive types and methods, type casts and tests (Java's 
instanceof), and a simple form of module. 

Roughly speaking, a class table CT is confined, for Own and Rep, if all of its 
methods preserve confinement. A confined heap is one where the objects can be 
partitioned into some owner islands (recall Fig. 1) along with a block of client 
objects as in Fig. 5. Furthermore, there are no references from clients to reps. (We 
use the term client for all objects except owners and reps.) 
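
As a rough illustration of the one condition stated explicitly here, that no client refers to a rep, the following Java sketch models a heap as a list of tagged objects with outgoing references. It is not the formal definition of Sect. 6, which involves more than this check; the names ConfinementSketch, Kind, Obj, noClientToRep, and demo are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConfinementSketch {
    enum Kind { OWNER, REP, CLIENT }

    static final class Obj {
        final Kind kind;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        Obj(Kind kind) { this.kind = kind; }
    }

    // True if no client object in the heap holds a reference to a rep.
    static boolean noClientToRep(List<Obj> heap) {
        for (Obj o : heap) {
            if (o.kind != Kind.CLIENT) continue;
            for (Obj target : o.refs) {
                if (target.kind == Kind.REP) return false;
            }
        }
        return true;
    }

    // Builds one island plus one client; optionally leaks the rep to the client.
    static boolean demo(boolean leak) {
        Obj client = new Obj(Kind.CLIENT);
        Obj owner = new Obj(Kind.OWNER);
        Obj rep = new Obj(Kind.REP);
        client.refs.add(owner);              // client uses the owner
        owner.refs.add(rep);                 // owner points to its rep
        rep.refs.add(client);                // outgoing reference from rep to client
        if (leak) client.refs.add(rep);      // representation exposure
        return noClientToRep(Arrays.asList(client, owner, rep));
    }
}
```

The reference from the rep back to the client is accepted by this check, in line with the treatment of outgoing references described in the introduction; only the leaked reference from client to rep is rejected.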

Sect. 3 discusses confinement in more detail and the formal definitions are the 
subject of Section 6. The full significance of the definitions does not become clear 
until Sect. 9 where we study subclasses of Own: an object of such a type inherits 
the methods and private fields of Own, which manipulate reps. To be useful, owner 
subclasses must have some access to reps. On the other hand, full access cannot 
be granted; to do so would be to study not the class as unit of encapsulation but a 
class together with its subclasses, which would be revised in concert. 

Our objective is to compare versions of Own that may use different reps. We say 
CT and CT' are comparable if they are identical except for having different versions 
of class Own, and those two versions declare the same public methods. The two 
versions of Own may well use different rep classes, say Rep and Rep'. Without loss 
of generality, our formalization has Rep and Rep' both present in CT and CT'. 

An interesting question is how to formalize basic couplings, step (1) of the proof 
method outlined in Sect. 2.1. To allow useful data structures, we need to allow 
representations to include pointers to client objects (e.g., elements of the queue in 
Fig. 1). But if the programmer is required to define a relation involving the state of 
objects outside the encapsulated data, how can this be done in a modular way? We 
have chosen to use relations on the encapsulated state only. Put differently: those 
things on which a coupling depends are considered as part of the island. Although 
other alternatives merit study, this one makes for transparent application of the 
formal results to interesting examples (this is done in Sects. 8 and 9). Moreover, it 
is straightforward to define the induced coupling. 

A basic coupling is a relation between a pair of owner islands for comparable CT 
and CT' . A simple example is given by (*) above in Sect. 2.2. More interesting is the 
observer example, discussed in Sect. 3, which uses a linked list of client objects (the 
observers). In Fig. 7 on page 44, a basic coupling is depicted in which the observer 
objects occur as dangling pointers from the corresponding islands. The point is 
that both versions are manipulating the same observer objects in the same way, 
including the invocation of methods on those objects. So the state of the observer 
objects is not relevant in the basic coupling; nor could it be, if the argument is to 
be carried out in a modular way independent of the particular clients. 

In a related pair of islands, both owners have the same class, which may well be 
a proper subclass of Own. 

The induced coupling relation for heaps relates h to h' just if there are confining 
partitions for which corresponding islands are pairwise related by the basic coupling. 
Moreover, there is an exact correspondence between client objects in h and h' . 
Primitive values are related by equality. Locations are related by an arbitrary 
bijection. 

The induced relation is a simulation if it is preserved by the methods of class 
Own in CT and in CT' . A method declared in one version of Own may be inherited 
in the other version; it is the behavior of those methods that matters. 

The abstraction theorem says that a simulation is preserved by all methods of all 
classes, provided that both class tables are confined. The identity extension lemma 
says that the induced relation is the identity, after garbage collection, for client 
states in which no owners are reachable. 

Sect. 7 gives the formal definitions for couplings and simulation in the special case 
where locations of objects other than reps are related by equality. The abstraction 
and identity extension results are proved there in detail. Sect. 10 generalizes the 
definitions to allow an arbitrary bijection on locations; abstraction and identity 
extension are proved for the general case. The special case is of interest because it is 
adequate for some applications in program analysis (e.g., [Banerjee and Naumann 
2002b]) and for non-trivial examples like those of Sect. 3 (as shown in Sect. 8). 
Examples that require the general case are given in Sect. 9; they are subclasses 
of Own that construct reps and pass them to methods of Own as in the factory 
pattern [Gamma et al. 1995]. Notation is more complicated for the general case 
but the proofs are not very different from the special case. 

These results are proved in terms of a semantic formulation of confinement; indeed, 
the details of this formulation come directly from what is needed in the proofs. 
Sect. 11 gives a syntax-directed static analysis: typing rules that characterize safe 
programs and a proof that safety implies confinement (soundness). Our objective 
is to round out the story by showing how confinement can be achieved in practice, 
not to give a definitive treatment of static analyses. But our analysis accepts 
many natural examples and the constraints are clearly motivated in the proof of 
soundness. The analysis is modular: it does not require code annotations and 
the only constraint it imposes on client programs is that they cannot manufacture 
representation objects. 

3. OWNERSHIP CONFINEMENT 

This section considers two substantial examples of representation-independence. 
The first is an object-oriented version of an example given by Meyer and Sieber 
[1988] as a challenge for semantics of Algol. It illustrates the expressiveness of 
object-oriented constructs, specifically the use of callbacks which go against the 
hierarchical calling structure which typifies the simplest forms of procedural and 
data abstraction. 

The second example is an instance of the observer pattern [Gamma et al. 1995] 
which is widely used in object-oriented programs. In addition to callbacks it involves 
a non-trivial data structure and outgoing references from representation objects to 
clients. Note that we use the term client not just for objects that use an abstraction 
(by instantiating it or calling its methods) but for any objects except instances of 
the abstraction of interest or its encapsulated representation. 

The section concludes with an overview of our semantic notion of confinement. 

3.1 Callbacks 

Meyer and Sieber [1988] consider the following pair of Algol commands: 

var n := 0; P(n := n+2); if n mod 2 = 0 then abort else skip fi (*) 

var n := 0; P(n := n+2); abort (†) 

Both invoke some procedure P, passing to it the command n := n+2 that acts on 
local variable n. (That is, P is passed a parameterless procedure whose calls have 
the effect n := n+2.) For any P, the commands are equivalent. The reason is that in 
the first example n is invariably even: P is declared somewhere not in the scope of 
n so the variable can only be affected by (possibly repeated) executions of n := n+2 
and this maintains the invariant. 

The difficulty in formalizing this argument is due to the difficulty of capturing 
the semantics of lexically scoped local variables and procedures in a language where 
local variables can be free in procedures that can be passed as arguments to other 
procedures. (It appears even more difficult, and remains an open problem, to cope 
with assignment of such procedures to variables [O'Hearn and Tennent 1997].) 

Now we consider a Java-like adaptation of the example, due to Peter O'Hearn. 
In place of local variable n it uses a private field g in a class A. Instead of passing 
the command n := n+2 as argument, an A-object passes a reference to itself; this 
gives access to a public method inc that adds 2 to the field. 

class A extends Object { 

int g; // (the default integer value is 0) 

unit callP(C y){ y.P(self); if self.g mod 2 = 0 then abort else skip fi } 
unit inc(){ self.g := self.g + 2 } } 

In the context of this class and some declaration of class C with method P, the 
Algol command (*) corresponds to the command 

C y := new C in A x := new A in x.callP(y) (‡) 

This aborts because after calling y.P, method callP aborts. The command (†) also 
corresponds to (‡) but in the context of an alternative implementation of class A: 

class A extends Object { 

int g; 

unit callP(C y){ y.P(self); abort } 
unit inc(){ self.g := self.g + 2 } } 

In Example 8.3, we use the abstraction theorem to prove equivalence of the two 
versions using coupling relation 

o.g = o'.g ∧ o.g mod 2 = 0. 

This relation is preserved by arbitrary P because P can affect the private field g 
only by calls to inc. 
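
The two versions of class A can be transcribed into plain Java for experimentation. The following is only a sketch under assumptions that are not part of the original: abort is modelled by throwing a hypothetical Abort exception, the client class C becomes an interface, the callback is typed by a hypothetical interface Inc so that one client can be run against both versions, and the names A1, A2 and CallbackDemo are ours.

```java
// Hypothetical transcription; Abort, Inc, C, A1, A2 and CallbackDemo are illustrative names.
class Abort extends RuntimeException {}

interface Inc { void inc(); }        // the only way a client can affect g

interface C { void P(Inc a); }       // an arbitrary client procedure

class A1 implements Inc {            // first version: tests the invariant
    private int g;                   // default value 0, so "g is even" holds initially
    public void callP(C y) {
        y.P(this);
        if (g % 2 == 0) throw new Abort();
    }
    public void inc() { g = g + 2; } // preserves evenness (also under int wrap-around)
}

class A2 implements Inc {            // second version: aborts unconditionally
    private int g;
    public void callP(C y) { y.P(this); throw new Abort(); }
    public void inc() { g = g + 2; }
}

class CallbackDemo {
    // Runs r and reports whether it ended by aborting.
    static boolean aborts(Runnable r) {
        try { r.run(); return false; } catch (Abort e) { return true; }
    }
}
```

For any terminating client passed to callP, both versions end by aborting, which is what the coupling relation predicts; the sketch does not, of course, replace the proof in Example 8.3.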

As Reynolds [1978] shows (see also [Reddy 1998]), instance-based object-oriented 
constructs can be expressed in Algol-like languages, but the latter are in some ways 
significantly more powerful. The Java version of the example can be seen as giving 
an explicit closure to represent the command n := n+2 in the form of method inc. 
Indeed the simplicity of the semantic model for our language can be explained by 
saying the language is defunctionalized [Reynolds 1972; Banerjee et al. 2001] and 
lacks true higher order constructs. If the example is written in such a language, 
P ranges over more limited procedures than in Algol. The root problem for Algol 
semantics [Reynolds 1981b; O'Hearn and Tennent 1995] and proof rules [Olderog 
1983; German et al. 1989] is the interaction between arbitrary nesting of variable 
and procedure declarations and possibility of passing procedures as arguments. In 
imperative languages like C and Modula-3, procedures can be passed as arguments 
and even stored in variables, but only if their free variables are in outermost scope. 
This restriction greatly simplifies implementation of the language, and it suffices 
to admit simple but adequate semantic models.^ The constructs of a Java-like 
language offer similar expressive power and also admit simple models. 

The example also illustrates what are known as callbacks in object-oriented programs. 
When an A-object invokes y.P(self) it passes a reference to itself, by which 
y may invoke a method on the A-object which is in the middle of executing method 
callP; this is a callback to A. If in (‡) we replace x.callP(y) by x.callP(self), and assume 
that (‡) is a constituent of a method of class C, then we get a callback to C. 

The point of the Algol example is modular reasoning about (*) and (†) independent 
from the definition of P. For the object-oriented version we can also consider 
reasoning independent from subclasses of A. If instead of (‡) we consider a method 

unit nn(C y, A x){ x.callP(y) } 

then there is the possibility that nn is passed an argument x of some subtype of A 
that overrides inc. By dynamic binding, the overriding implementation would be 
invoked by callP and our reasoning above would no longer be sound. For modular 
reasoning, we could require that any overriding declaration of inc must preserve 
the intended invariant that g is even. To impose such a requirement (and a 
corresponding one for callP) is to require behavioral subclassing [Liskov and Wing 
1994; Dhara and Leavens 1996]. One important application of simulations is in the 
formalization of behavioral subclassing but that is beyond the scope of this paper. 

Unlike much work on reasoning about object-oriented programs, our results 
do not depend on behavioral subclassing. Representation independence holds for 
clients and abstractions that do not exhibit behavioral subclassing (see Sect. 9.2). 

3.2 The observer pattern 

In this subsection we consider variations on an often-used design known as the 
observer pattern [Gamma et al. 1995] which involves a non-trivial recursive data 

^Naumann [2002] uses such a model to prove an abstraction theorem and apply it to Meyer- 
Sieber examples. The simpler of their examples can be proved directly in the model without use 
of simulations [Naumann 2001]. 

class Observer extends Object { // "abstract class" to be overridden in clients 
unit notify(){ abort } } 

class Node extends Object { // rep for Observable 
Observer ob; 

Node nxt; // next node in list 
unit setOb(Observer o){ self.ob := o } 
unit setNext(Node n){ self.nxt:= n } 
Observer getOb(){ result := self.ob } 
Node getNext(){ result := self.nxt } } 

class Observable extends Object { // owner 
Node fst; // first node in list 

unit add(Observer ob){ Node n := new Node; n.setOb(ob); n.setNext(self.fst); self.fst := n } 

unit notifyAll(){ Node n := self.fst; while n ≠ null do n.getOb().notify(); n := n.getNext() od } } 

Fig. 2. First version of observer pattern, in procedural style. 

structure using multiple rep objects and outgoing references to client objects. Further 
variations are given in Sect. 8. 

We focus attention on the abstraction provided by an Observable object (sometimes 
called the "subject"). It maintains a list of so-called observers to be notified 
when some event occurs. Its public method add allows the addition of an observer 
object to the list. The public method notifyAll represents the event of interest; its 
effect is to invoke method notify on each observer in the list. What notify does is 
not relevant, so long as it is confined.^ 

The abstraction involves a collection of objects, a well-worn example for data 
representations. Simple collections are essentially mutable sets of pointers to client 
objects. Testing whether a reference is in the set requires only pointer equality. To 
facilitate lookup by key, and to facilitate implementations like binary search trees, 
it may be necessary for the abstraction to invoke a comparison method on the client 
objects in the collection. This is similar to the call to notify in the observer pattern. 

In the first version of the observer example, Fig. 2, most of the work is done 
by the owner class Observable, which uses rep class Node to store observers in a 
singly linked list. A more object-oriented version appears in Fig. 8 of Sect. 8; it 
exemplifies the use of class-based visibility. 

Fig. 3 gives example client classes AnObserver and Main. Class AnObserver records 
notifications in its state. Method main constructs and initializes an Observable, 
installs an observer, and invokes notifyAll; upon termination, ob.count = 1 and no 
Observable is reachable. 

Fig. 4 gives another version of Observable, using a sentinel node [Cormen et al. 
1990], for the sake of an example. A more compelling use of sentinels is the version 
of Fig. 9 (in Sect. 8), which also uses subclassing and dynamic dispatch. 

In Sect. 8 we show equivalence of the versions of Figs. 2 and 4 as an application 
of the abstraction theorem and identity extension. The coupling relation describes 



^In Java, class Object declares methods notify and notifyAll. Here we assume that no superclass 
of Observer declares notify and no superclass of Observable declares notifyAll. In the Java versions 
of our examples we use different names. 

class AnObserver extends Observer { 

int count; 

unit notify(){ self.count := self.count+1 } } 

class Main extends Object { 
AnObserver ob; 
unit main(){ 

ob := new AnObserver; Observable obl := new Observable; obl.add(ob); obl.notifyAll() } } 

Fig. 3. Example client for Observable. 

class Node2 extends Object { // rep for Observable 

Observer ob; 
Node2 nxt; 

unit setOb(Observer o){ self.ob := o } 
unit setNext(Node2 n){ self.nxt := n } 
Observer getOb(){ result := self.ob } 
Node2 getNext(){ result := self.nxt } } 
class Observable extends Object { // owner 
Node2 snt; // sentinel node pointing to list 
con{ self.snt := new Node2 } 
unit add(Observer ob){ 
Node2 n := new Node2; n.setOb(ob); n.setNext(self.snt.getNext()); self.snt.setNext(n); } 
unit notifyAll(){ 
Node2 n := self.snt.getNext(); while n ≠ null do n.getOb().notify(); n := n.getNext() od } } 

Fig. 4. Version of observable that uses sentinel node, in procedural style 

the correspondence between a pair of lists, one with and one without a sentinel node 
(see Fig. 7). It is enough to say that the same Observer locations are stored in the 
lists, in the same order. The state of the Observer is not relevant — nor could it be 
in a modular treatment, as class Observer has no fields. To reason about outgoing 
calls, namely to notify, it is enough to show that the two implementations make the 
same calls. Those calls may lead to calls back to the Observable, but encapsulation 
ensures that those calls are the only way the behavior of notify can depend on, or 
affect, the Observable. 
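
The claim that the two implementations make the same outgoing calls can be tried out with a small Java sketch. It is illustrative only and rests on assumptions of ours: the rep classes are written as private nested classes rather than as separate classes with getters and setters, notify and notifyAll are renamed (java.lang.Object already declares methods with those names), and Obs, ListObservable, SentinelObservable and Recorder are hypothetical names.

```java
import java.util.List;

// Sketch only; all class names here are illustrative, not taken from the figures.
abstract class Obs { abstract void notifyObs(); }

class ListObservable {                 // in the spirit of Fig. 2: no sentinel
    private static final class Node { Obs ob; Node nxt; }
    private Node fst;                  // first node in list
    void add(Obs ob) {
        Node n = new Node(); n.ob = ob; n.nxt = fst; fst = n;
    }
    void notifyEveryone() {
        for (Node n = fst; n != null; n = n.nxt) n.ob.notifyObs();
    }
}

class SentinelObservable {             // in the spirit of Fig. 4: sentinel node
    private static final class Node2 { Obs ob; Node2 nxt; }
    private final Node2 snt = new Node2();
    void add(Obs ob) {
        Node2 n = new Node2(); n.ob = ob; n.nxt = snt.nxt; snt.nxt = n;
    }
    void notifyEveryone() {
        for (Node2 n = snt.nxt; n != null; n = n.nxt) n.ob.notifyObs();
    }
}

class Recorder extends Obs {           // client that logs the order of notifications
    private final String name;
    private final List<String> log;
    Recorder(String name, List<String> log) { this.name = name; this.log = log; }
    @Override void notifyObs() { log.add(name); }
}
```

Adding the same observers in the same order to both versions and then notifying yields identical logs, which is the observable content of the coupling: the same Observer locations, in the same order.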

Except for the bad method of Sect. 2.1, all of the examples discussed so far satisfy 
the confinement conditions discussed next. 

3.3 Confinement 

We need a notion of confinement to prevent representation exposures that invali- 
date simulation-based reasoning, as discussed in Sect. 2.1. A related issue is how to 
formulate simulation. In all the examples, our discussion centered on a correspond- 
ing pair of instances for two implementations of the owner class. In particular, the 
coupling relations are described for a pair of instances as discussed in Sect. 2.3. 
A class- or module-based notion of confinement might rule out leaks, but we aim 
for an instance-based notion of simulation suited to the kind of examples we have 
discussed. These involve an abstraction provided by a single instance (the owner 
object) using a representation accessed via its private fields. So we need to prevent 
problematic sharing not only between client and owner but also between different 
instances of the owner class. 

Fig. 5 illustrates instance-based owner confinement; in this case Nodes are con- 
fined to their owning Observable. Following Hogg [1991], we use the term island for 
the sub-heap consisting of an owner and its reps. Dashed lines in the Figure depict 
two islands. Our notion of owner confinement imposes four conditions on islands; 
here are the first three: 

(1) there are no references from a client object to a rep; 

(2) there are no references from an owner to reps in a different island; 

(3) there are no references from a rep into a different island. 

The Figure exhibits most allowed references, but we also allow an owner to reference 
another owner (see Fig. 6 on page 33). An example is given in Sect. 9.1. Note that 
heap confinement is a state predicate. The full definition, formalized in Sect. 6, 
deals with preservation of this predicate by commands and also with leaks via 
parameter passing in outgoing method calls from island to client. 
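
Conditions (1) to (3) can be read as a predicate on a heap together with a given partition into islands. The following Java sketch checks them on a toy model; it is not the formal definition of Sect. 6. The model (objects tagged with a kind and an island number, references as pairs) and the names Kind, Obj, Ref and ConfinementCheck are assumptions made for illustration, and the partition is supplied by the caller rather than searched for.

```java
import java.util.List;

// Toy heap model; every name in this block is hypothetical.
enum Kind { CLIENT, OWNER, REP }

final class Obj {
    final Kind kind;
    final int island;                  // ignored for clients
    Obj(Kind kind, int island) { this.kind = kind; this.island = island; }
}

final class Ref {
    final Obj from, to;
    Ref(Obj from, Obj to) { this.from = from; this.to = to; }
}

final class ConfinementCheck {
    // True if no reference violates conditions (1)-(3) for the given partition.
    static boolean confined(List<Ref> heap) {
        for (Ref r : heap) {
            // A reference crosses islands only if both ends lie in (different) islands.
            boolean crossIsland = r.from.kind != Kind.CLIENT
                    && r.to.kind != Kind.CLIENT
                    && r.from.island != r.to.island;
            // (1) no reference from a client to a rep
            if (r.from.kind == Kind.CLIENT && r.to.kind == Kind.REP) return false;
            // (2) no reference from an owner to a rep in a different island
            if (r.from.kind == Kind.OWNER && r.to.kind == Kind.REP && crossIsland) return false;
            // (3) no reference from a rep into a different island
            if (r.from.kind == Kind.REP && crossIsland) return false;
        }
        return true;
    }
}
```

Note that the sketch deliberately allows references from reps to clients and from one owner to another, as the text does.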

In class-based languages with inheritance, there is a subclass (or "protected") 
interface in addition to the public one. This raises the possibility of expressing 
encapsulation of reps for not only (instances of) the owner class but also its sub- 
classes. We have chosen the alternative that subclasses are like clients in that fields 
they declare may not point to reps. To the list of conditions above we add: 

(4) references from an owner's fields to its reps are only in the private fields of the 
owner class. 

In order not to abandon the expressiveness of subclassing, however, we allow subclass 
methods to manipulate reps: they may be constructed, stored in local variables, 
and passed to the owner. This fits well with the factory pattern [Gamma 
et al. 1995] which allows owner behavior to be adapted in owner subclasses without 
violating encapsulation. To balance the paper, we have deferred the relevant 
examples to Sect. 8. 

Confinement is formulated using class names. Two incomparable class names, 
Own and Rep, are designated. An object is considered to be an owner (respectively, 
a rep) if its type is Own (resp. Rep) or a subtype thereof. Incomparability is a mild 
restriction that enforces a widely-followed discipline of distinguishing between rep 
objects (e.g., nodes in a linked list) and objects representing abstractions (e.g., a 
list). The technical benefit of incomparability is that if C and D are incomparable, 
which we write C ^ D, then an expression of type C never has a value of type D. 

We aim for a perspicuous separation between the semantic property needed for 
the abstraction theorem and the syntactic conditions used for static analysis. The 
"semantic" property in fact includes conditions on method signatures. For exam- 
ple, we impose the restriction that the return type of a public owner method is 
incomparable to Rep; this disallows method bad of Sect. 2.1. 

Our use of types to formulate alias restrictions allows heterogeneous data struc- 
tures, but is slightly restrictive in that there is a single common superclass for all 
reps. For more flexibility in practical applications, our theory could be adapted by 
taking Own and Rep to be "class types" ("interfaces" in Java), rather than class 
implementations. The generalization is straightforward and not illuminating. 

The more substantial restriction is due to the fact that class Object is comparable 
to all classes. Because Java lacks parametric polymorphism, Object is 
often used to express generics, e.g., a list containing elements of arbitrary type. A 
method to enumerate the list would have return type Object, which violates our 
restriction on owner methods. This restriction could be dropped in favor of more 
sophisticated conditions to ensure that no rep is returned (see Sect. 12). But in 
practice many generics have some sort of constraint expressed by a class or inter- 
face type — like Observer in our examples, or Comparable for data structures that 
depend on an ordering. These do not run afoul of our restriction. In any case, the 
use of Object for generics is widely deplored because it undercuts the benefits of 
typing; parametric types are clearly preferable. 

Some works on confinement have considered all the confinement properties intended 
to be satisfied by a program, using hierarchical notions of ownership [Clarke 
et al. 2001; Müller 2002]. For example, a Set could own the header of a list which in 
turn owns the nodes of the list. This is not necessary for our purposes (see Sect. 12). 
To analyse the abstraction provided by the set, we would consider both the header 
and nodes to be reps, with a common superclass Rep. On the other hand, to replace 
one header implementation by another, Set is irrelevant; we choose Own to be the 
header and Rep for the nodes. 

4. SYNTAX 

This section formalizes the language, for which purpose we adapt some notations 
from Featherweight Java [Igarashi et al. 2001].^ To avoid burdening the reader with 
straightforward technicalities we deliberately confuse surface syntax with abstract 



^But the languages differ, e.g., ours has imperative features and private fields. 

syntax. We do not distinguish between classes and class types. We confuse syntactic 
categories with names of their typical elements. Barred identifiers like $\overline{T}$ indicate 
finite lists, e.g., $\overline{T}\ \overline{f}$ stands for a list $\overline{f}$ of field names with corresponding types $\overline{T}$. 
The bar has no semantic import; $\overline{T}$ has nothing to do with $T$. 

The grammar is based on given sets of class names (with typical element C), field 
names (f), method names (m), and names (x) for parameters and local variables. 
In most respects self and result are like any other variables but self cannot be the 
target of assignment. 



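Grammar 

$$
\begin{array}{rcll}
T & ::= & \mathsf{bool} \mid \mathsf{unit} \mid C & \text{data type} \\
CL & ::= & \mathsf{class}\ C\ \mathsf{extends}\ C\ \{\ \overline{T}\ \overline{f};\ \mathsf{con}\{\ S\ \}\ \overline{M}\ \} & \text{class declaration} \\
M & ::= & T\ m(\overline{T}\ \overline{x})\ \{\ S\ \} & \text{method declaration} \\
S & ::= & x := e \mid e.f := e & \text{assign to variable, to field} \\
  & \mid & x := \mathsf{new}\ C & \text{object construction} \\
  & \mid & x := e.m(\overline{e}) \mid x := \mathsf{super}.m(\overline{e}) & \text{method calls} \\
  & \mid & T\ x := e\ \mathsf{in}\ S & \text{local variable block} \\
  & \mid & \mathsf{if}\ e\ \mathsf{then}\ S\ \mathsf{else}\ S\ \mathsf{fi} \mid S;\ S & \text{conditional, sequence} \\
e & ::= & x \mid \mathsf{null} \mid \mathsf{true} \mid \mathsf{false} \mid \mathsf{it} & \text{variable, constant} \\
  & \mid & e.f \mid e = e & \text{field access, equality test} \\
  & \mid & e\ \mathsf{is}\ C \mid (C)\ e & \text{type test, cast}
\end{array}
$$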

Without formalizing it precisely, we assume there is a class Object with no fields 
or methods which can be used as a superclass. Additional primitive types, such as 
integers, can be treated in the same way as bool and unit (integers can also be 
represented, e.g., in unary using linked lists). 

In the formal language, expressions do not have side effects. Object construc- 
tion, new, occurs only as a command x := new C that assigns to a local variable. 
Method calls are not expressions but rather occur in special assignments x := e.m(e) 
to allow both heap effects and a return value. 

Remark 4.1 (syntactic sugar) In examples we use several abbreviations: 

— A method call command $e.m(\overline{e})$, e.g., self.g.set(true), abbreviates a call assigning 
to an otherwise unused local variable. 
— Assignment of a new object to a field abbreviates a local block assigning the new 
object to a variable that is then assigned to the field. 
— Methods that return values but do not mutate state are used in expressions, e.g., 
the argument in self.inout := convertToString(z.getg()) and the target object in 
n.getOb().notify(). These are easily desugared using fresh variables and suitable 
assignments. 

As the language has general recursion, we omit loops. For desugaring loops it would 
be convenient to have local or private method declarations, but the module-scoped 
methods added in Sect. 10 suffice. The issue is discussed in Sect. 8.1. □ 

A program is given as a class table CT, a finite partial function sending class 
name C to its declaration CT(C) which may make mutually recursive references to 
other classes. Well formed class tables are characterized using typing rules which 
are expressed using some auxiliary functions that in turn depend on the class table, 
as is needed to allow mutual recursion. Consider a declaration 

$$CT(C) = \mathsf{class}\ C\ \mathsf{extends}\ D\ \{\ \overline{T}_1\ \overline{f};\ \mathsf{con}\{\ S_1\ \}\ \overline{M}\ \}.$$

To refer to the constructor, we define $\mathit{constr}\,C = S_1$. For the direct superclass of 
C, we define $\mathit{super}\,C = D$. Let M be in the list $\overline{M}$ of method declarations, with 

$$M = T\ m(\overline{T}_2\ \overline{x})\ \{\ S_2\ \}.$$

We record the typing information by defining $\mathit{mtype}(m, C) = \overline{T}_2 \to T$. (Note that 
$\overline{T}_2 \to T$ is not a data type in the language.) The parameter names are given by 
$\mathit{pars}(m, C) = \overline{x}$. If m has no declaration in CT(C) but $\mathit{mtype}(m, D)$ is defined then 
m is an inherited method: we define $\mathit{mtype}(m, C) = \mathit{mtype}(m, D)$ and $\mathit{pars}(m, C) = 
\mathit{pars}(m, D)$. For the declared fields, we define $\mathit{type}(\overline{f}, C) = \overline{T}_1$ and $\mathit{dfields}\,C = 
(\overline{f} : \overline{T}_1)$. Here $\overline{f} : \overline{T}_1$ denotes a finite mapping of field names to types. To include 
inherited fields, we define $\mathit{fields}\,C = \mathit{dfields}\,C \cup \mathit{fields}\,D$ and assume $\overline{f}$ is disjoint from 
the names in $\mathit{fields}\,D$. The built-in class Object has no methods and $\mathit{fields}(\mathsf{Object})$ 
is the empty list. 
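
The auxiliary functions mtype and fields are defined by following the superclass chain, which the following Java sketch makes explicit. The encoding is an assumption of ours: a class table is a map from class names to a hypothetical ClassDecl record, method types are kept as plain strings, only field names (not their types) are tracked, and Object is represented by the absence of an entry. The sketch also assumes the superclass chain is acyclic, as in a well formed class table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical encoding of a class table; ClassDecl and ClassTable are illustrative names.
final class ClassDecl {
    final String superName;                                       // super C
    final List<String> declaredFields = new ArrayList<>();        // names in dfields C
    final Map<String, String> declaredMethods = new HashMap<>();  // m -> its type, as text
    ClassDecl(String superName) { this.superName = superName; }
}

final class ClassTable {
    final Map<String, ClassDecl> ct = new HashMap<>();

    // mtype(m, C): the declaration in C if present, otherwise inherited from super C.
    // Returns null when mtype(m, C) is undefined.
    String mtype(String m, String c) {
        ClassDecl d = ct.get(c);
        if (d == null) return null;            // Object (or an undeclared class): no methods
        String t = d.declaredMethods.get(m);
        return t != null ? t : mtype(m, d.superName);
    }

    // fields C = dfields C together with fields (super C); fields(Object) is empty.
    List<String> fields(String c) {
        ClassDecl d = ct.get(c);
        if (d == null) return new ArrayList<>();
        List<String> fs = fields(d.superName); // inherited fields first
        fs.addAll(d.declaredFields);
        return fs;
    }
}
```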

A typing context $\Gamma$ is a finite mapping from variable and parameter names to 
data types, such that $\mathsf{self} \in \mathrm{dom}\,\Gamma$. Whereas the Java format T x is used in code to 
give x type T, it is written x : T in typing contexts. Typing of commands for methods 
declared in class C is expressed using judgements $\Gamma \vdash S$ where $\Gamma\,\mathsf{self} = C$. Moreover, 
if $\mathit{mtype}(m, C) = \overline{T} \to T$ then $\Gamma\,\overline{x} = \overline{T}$ and $\Gamma\,\mathsf{result} = T$.* For brevity, we sometimes 
say "command" to refer to a derivable judgement $\Gamma \vdash S$. The judgement $\Gamma \vdash e : T$ 
says that expression e has type T. The constructor is typed using a judgement 
$\mathsf{self} : C \vdash S : \mathsf{con}$ which is distinguished from the typing of S as a command, as the 
former is used to define the semantics of S as a constructor, which in turn is used 
in the semantics of object construction (new). 

Definition 4.2 (subtyping, $\leq$) The class table determines a subtyping relation 
$\leq$ as follows. If T or U is bool or unit then define $T \leq U$ iff $T = U$. For class 
types C and D, define $C \leq D$ iff either $C = D$ or $\mathit{super}\,C \leq D$. □ 
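
Definition 4.2 translates directly into a walk up the superclass chain. The Java sketch below is a minimal illustration, assuming types are represented by their names and the class table by a hypothetical map superOf that sends each declared class to its direct superclass (Object has no entry). The helper incomparable is ours and corresponds to the incomparability condition on Own and Rep in Sect. 3.3.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch; Subtyping and superOf are hypothetical names.
final class Subtyping {
    final Map<String, String> superOf = new HashMap<>();   // C -> super C

    boolean isPrimitive(String t) { return t.equals("bool") || t.equals("unit"); }

    // T <= U: equality for bool and unit, reflexive-transitive superclass chain for classes.
    boolean leq(String t, String u) {
        if (isPrimitive(t) || isPrimitive(u)) return t.equals(u);
        for (String c = t; c != null; c = superOf.get(c)) {
            if (c.equals(u)) return true;
        }
        return false;
    }

    // C and D are incomparable if neither is a subtype of the other.
    boolean incomparable(String c, String d) { return !leq(c, d) && !leq(d, c); }
}
```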

Subsumption is built into the rules for specific constructs. For example, the assignment 
rule allows $x : D,\ y : E,\ \mathsf{self} : C \vdash x := y$ provided that $E \leq D$. 

The constructor for one class may construct objects of other classes (Fig. 4 is 
an example). But we prefer not to model divergence due to cyclic constructor 
dependencies as in the following. 

class B extends Object { B f; con{ self.f := new C } } 
class C extends B { con{ skip } } 

(Recall that to initialize a C object both the B- and C-constructor are applied.) 



*In [Banerjee and Naumann 2002a] we make C an explicit, and redundant, part of the judgement, 
and we use separate return statements rather than variable result. 

Definition 4.3 (constructor dependence, $\sqsubset$) For B, C ranging over declared 
classes, we say that C has constructor dependence on B, written $B \sqsubset C$, iff $B \sqsubset (\mathit{super}\,C)$ 
or x := new B occurs in $\mathit{constr}\,C$, for some x. □ 

Note that $B \sqsubset C$ just if the constructor of C or one of its ancestor classes contains 
new B (by which we mean x := new B for some x). The transitive closure $\sqsubset^+$ has 
$B \sqsubset^+ C$ just if construction of a C-object entails construction of a B-object. For 
the example above we have $C \sqsubset B$ and $C \sqsubset C$. 
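
Constructor dependence and its transitive closure are easy to compute from the class table. The Java sketch below is one way to do so under assumptions of ours: the maps superOf and news (recording, for each class, which classes appear in a new command of its constructor) and the class name CtorDeps are hypothetical, Object has no entries, and the superclass chain is assumed acyclic.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch; CtorDeps, superOf and news are illustrative names.
final class CtorDeps {
    final Map<String, String> superOf = new HashMap<>();     // C -> super C
    final Map<String, Set<String>> news = new HashMap<>();   // C -> classes B with "new B" in constr C

    // All B on which C has constructor dependence: "new B" occurs in the
    // constructor of C or of one of its ancestor classes.
    Set<String> direct(String c) {
        Set<String> out = new HashSet<>();
        for (String k = c; k != null; k = superOf.get(k)) {
            out.addAll(news.getOrDefault(k, new HashSet<>()));
        }
        return out;
    }

    // Transitive closure of the dependence, computed with a worklist.
    Set<String> transitive(String c) {
        Set<String> seen = new HashSet<>();
        ArrayDeque<String> work = new ArrayDeque<>(direct(c));
        while (!work.isEmpty()) {
            String b = work.pop();
            if (seen.add(b)) work.addAll(direct(b));
        }
        return seen;
    }

    // A class that transitively depends on itself makes the class table ill formed.
    boolean cyclic(String c) { return transitive(c).contains(c); }
}
```

On the B/C example above, class C depends on itself because it inherits the constructor of B, which constructs a C; this is the cyclic dependence that well formedness rules out.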

Definition 4.4 (well formed class table) A class table is well formed provided 
it satisfies the following conditions. 

— Each class declaration $\mathsf{class}\ C\ \mathsf{extends}\ D\ \{\ \overline{T}\ \overline{f};\ \mathsf{con}\{\ S\ \}\ \overline{M}\ \}$ is well formed, 
that is, each method declaration M in $\overline{M}$ is well formed, and $\mathsf{self} : C \vdash S : \mathsf{con}$, 
according to the rules to follow. 

— If C occurs as the type of a field or parameter in some class then CT(C) is 
defined. No field or method has multiple declarations in a class. 

— The subclass relation $\leq$ is antisymmetric. 

— Transitive constructor dependence, $\sqsubset^+$, is antisymmetric and irreflexive. □ 

The rules are straightforward renderings of the typing rules for Java, for private 
fields, public methods and public classes [Arnold and Gosling 1998]. 

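Typing of constructors 

$$
\frac{S = \mathit{constr}\,C \qquad \mathsf{self} : C \vdash S \qquad \text{no method calls occur in } S}
     {\mathsf{self} : C \vdash S : \mathsf{con}}
$$

Typing of method declarations 

$$
\frac{\begin{array}{c}
\overline{x} : \overline{T},\ \mathsf{self} : C,\ \mathsf{result} : T \vdash S \\
\mathit{mtype}(m, \mathit{super}\,C) \text{ is undefined or equals } \overline{T} \to T \\
\mathit{pars}(m, \mathit{super}\,C) \text{ is undefined or equals } \overline{x}
\end{array}}
{C \vdash T\ m(\overline{T}\ \overline{x})\ \{S\}}
$$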



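In this method rule, the condition on mtype is the standard invariance restriction 
on method types, as in Java [Arnold and Gosling 1998; Abadi and Cardelli 1996]. 
The last antecedent in the rule, concerning $\mathit{pars}(m, D)$, ensures that all declarations 
of a method use the same parameter names. This loses no generality and slightly 
streamlines the formalization of the semantic domains in the sequel. 

Typing of expressions 

$$
\Gamma \vdash x : \Gamma\,x \qquad
\Gamma \vdash \mathsf{null} : B \qquad
\Gamma \vdash \mathsf{it} : \mathsf{unit} \qquad
\Gamma \vdash \mathsf{true} : \mathsf{bool} \qquad
\Gamma \vdash \mathsf{false} : \mathsf{bool}
$$

$$
\frac{\Gamma \vdash e_1 : T_1 \qquad \Gamma \vdash e_2 : T_2}
     {\Gamma \vdash e_1 = e_2 : \mathsf{bool}}
\qquad
\frac{\Gamma \vdash e : (\Gamma\,\mathsf{self}) \qquad (f : T) \in \mathit{dfields}(\Gamma\,\mathsf{self})}
     {\Gamma \vdash e.f : T}
$$

$$
\frac{\Gamma \vdash e : D \qquad B \leq D}
     {\Gamma \vdash (B)\,e : B}
\qquad
\frac{\Gamma \vdash e : D \qquad B \leq D}
     {\Gamma \vdash e\ \mathsf{is}\ B : \mathsf{bool}}
$$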



The rule for equality test allows comparison of arbitrary data types, and is reference 
equality in the case of class types. But if $e_1$ and $e_2$ have types not related 
by $\leq$, the test $e_1 = e_2$ is false except when both are nil. The rule for field access 
enforces private visibility: only a method declaration in class C can access fields 
declared in CT(C). It can access those fields on any object of its type; to access 
its own fields the expression is self.f. The rule for cast is standard.^ 

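Typing of commands 

$$
\frac{\Gamma \vdash e : T \qquad T \leq \Gamma\,x \qquad x \neq \mathsf{self}}
     {\Gamma \vdash x := e}
\qquad
\frac{\Gamma \vdash e_1 : (\Gamma\,\mathsf{self}) \qquad (f : T) \in \mathit{dfields}(\Gamma\,\mathsf{self}) \qquad \Gamma \vdash e_2 : U \qquad U \leq T}
     {\Gamma \vdash e_1.f := e_2}
$$

$$
\frac{\begin{array}{c}
\Gamma \vdash e : D \qquad \mathit{mtype}(m, D) = \overline{T} \to T \\
\Gamma \vdash \overline{e} : \overline{U} \qquad \overline{U} \leq \overline{T} \qquad x \neq \mathsf{self} \qquad T \leq \Gamma\,x
\end{array}}
{\Gamma \vdash x := e.m(\overline{e})}
\qquad
\frac{\begin{array}{c}
\mathit{mtype}(m, \mathit{super}(\Gamma\,\mathsf{self})) = \overline{T} \to T \\
\Gamma \vdash \overline{e} : \overline{U} \qquad \overline{U} \leq \overline{T} \qquad x \neq \mathsf{self} \qquad T \leq \Gamma\,x
\end{array}}
{\Gamma \vdash x := \mathsf{super}.m(\overline{e})}
$$

$$
\frac{B \leq \Gamma\,x \qquad x \neq \mathsf{self} \qquad B \neq \mathsf{Object}}
     {\Gamma \vdash x := \mathsf{new}\ B}
\qquad
\frac{\Gamma \vdash S_1 \qquad \Gamma \vdash S_2}
     {\Gamma \vdash S_1;\ S_2}
$$

$$
\frac{\Gamma \vdash e : \mathsf{bool} \qquad \Gamma \vdash S_1 \qquad \Gamma \vdash S_2}
     {\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}}
\qquad
\frac{\Gamma \vdash e : U \qquad U \leq T \qquad x \neq \mathsf{self} \qquad (\Gamma, x : T) \vdash S}
     {\Gamma \vdash T\ x := e\ \mathsf{in}\ S}
$$

The command rules have hypotheses involving partial functions which must be 
defined for the hypothesis to be satisfied. For example, in the rule for super calls, 
$\mathit{mtype}(m, \mathit{super}\,C)$ must be defined and equal to $\overline{T} \to T$. 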

Each expression and command construct is the conclusion of exactly one typing 
rule, and there are no other rules. Thus we have the following. 

Lemma 4.5 A typing $\Gamma \vdash S$ or $\Gamma \vdash e : T$ has at most one derivation. □ 



^It is not adequate for expressions that arise through substitutions used in program logic (see Cavalcanti 
and Naumann [1999]) and in small-step semantics (see Igarashi et al. [2001]); the latter 
source uses the term "stupid cast" for the typing rule that allows (B) e when B is not a subclass 
of the static type of e. 

Definition 4.6 (inheritance) Method m is inherited in C from B if $C \leq B$, 
there is a declaration for m in B, and there is no declaration for m in any D such 
that $C \leq D < B$. To make the class table explicit, we also say m is inherited from 
B in CT(C). □ 

Because the language has single inheritance, the subtyping relation $\leq$ is a tree: if 
$D \leq B$ and $D \leq C$ then $B \leq C$ or $C \leq B$. If $\mathit{mtype}(m, C)$ is defined for some C 
then it is defined for all subclasses of C. For a given method name m and class C, 
there is a unique ancestor class declaring m that is least with respect to $\leq$. 

Lemma 4.5 allows proofs by structural induction on typings. The following notion 
facilitates induction on inheritance chains. 

Definition 4.7 (method depth) For any m and C such that ■mtype{m, C) is 
defined, the method depth of C for m in CT is defined by depth(m, C) = 1 + 
depth{m, superC) if mtype{m, superC) is defined; otherwise, depth{m, C) = 0. □ 

An immediate consequence is that if $\mathit{mtype}(m, C)$ is defined and $\mathit{depth}(m, C) = 0$
then $CT(C)$ has a declaration for $m$.
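
To make Definitions 4.6 and 4.7 concrete, here is a minimal Python sketch over a toy class table. The class names, the dictionary encoding `CT`, and the helper names `super_of`, `mtype_defined`, `depth`, and `inherited_from` are assumptions made for this illustration only; they are not part of the formal development.

```python
# Toy class table (hypothetical): class name -> (superclass, declared methods).
CT = {
    "Object": (None, set()),
    "A": ("Object", {"m"}),
    "B": ("A", set()),      # inherits m from A
    "C": ("B", {"m"}),      # overrides m
    "D": ("C", set()),      # inherits m from C
}

def super_of(c):
    return CT[c][0]

def mtype_defined(m, c):
    """mtype(m, C) is defined iff C or one of its ancestors declares m."""
    while c is not None:
        if m in CT[c][1]:
            return True
        c = super_of(c)
    return False

def depth(m, c):
    """depth(m, C) = 1 + depth(m, super C) if mtype(m, super C) is defined, else 0."""
    assert mtype_defined(m, c), "depth is only defined where mtype is defined"
    s = super_of(c)
    if s is not None and mtype_defined(m, s):
        return 1 + depth(m, s)
    return 0

def inherited_from(m, c):
    """Assuming mtype(m, C) is defined: the class B such that m is inherited
    in C from B, or None when C itself declares m."""
    if m in CT[c][1]:
        return None
    b = super_of(c)
    while b is not None and m not in CT[b][1]:
        b = super_of(b)
    return b
```

In this toy table, depth 0 occurs exactly at the topmost class declaring `m`, which matches the consequence stated above.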

Finally, we consider ramifications of constructor dependence. Note that $\mathit{Object} \not\sqsubset C$
for all $C$, by the typing rule for new.

Definition 4.8 (semantic dependence, $\ll$) As an auxiliary notation, we define
$B \preceq C$ iff $\{D \mid D \sqsubset^+ B\} \subseteq \{D \mid D \sqsubset^+ C\}$ and write $B \prec C$ if this inclusion is
proper. For classes $B$, $C$ declared in the class table, define $B \ll C$ iff $B \prec C$ or
both $B \preceq C$ and $B > C$. □

Lemma 4.9 For a well formed class table we have the following. 

(1) $\ll$ is well founded.

(2) $\mathit{super}\,C \ll C$ for all $C$.

(3) $B \sqsubset C$ implies $B \ll C$ for all $B$ and $C$.

Proof. Note that $\preceq$ is a preorder but not antisymmetric, so $\ll$ is not a
lexicographic order per se. To prove (1), define $\mathit{deps}\,C = \{D \mid D \sqsubset^+ C\}$ for any $C$. Then
we have $B \ll C$ iff $(\mathit{deps}\,B, B) \ll (\mathit{deps}\,C, C)$, where $\ll$ is defined by $(X, B) \ll (Y, C)$
iff $X \subset Y$ or $X \subseteq Y$ and $B > C$ (where $\subset$ means proper subset). This is logically
equivalent to: $X \subset Y$ or $X = Y$ and $B > C$, which shows that the definition is the
lexicographic coupling of $\subset$ and $>$. As $\subset$ here is for finite subsets of declared class
names, both $\subset$ and $>$ are well founded, hence so is their lexicographic coupling.

For (2), if $D \sqsubset^+ \mathit{super}\,C$ then $D \sqsubset^+ C$ by definition of $\sqsubset$; hence $\mathit{super}\,C \preceq C$.
Also, $\mathit{super}\,C > C$, so (2) holds by definition of $\ll$.

For (3), suppose $B \sqsubset C$. Then, by transitivity, $\{D \mid D \sqsubset^+ B\} \subseteq \{D \mid D \sqsubset^+ C\}$.
Also, we have $B \sqsubset^+ C$ but not $B \sqsubset^+ B$, by well formedness of the class table, so the
inclusion is proper. That is, $B \prec C$, whence $B \ll C$ by definition of $\ll$. □

5. SEMANTICS 

This section defines the semantic domains, then the semantics of expressions and 
commands, and finally the semantics of well formed class tables. 

Because methods are associated with classes rather than with instances, the 
semantic domains are rather simple. There are no recursive domain equations 
to be solved: subclassing (<) is acyclic and the cycle of recursive references via
class fields is broken via the heap. Mutually recursive method invocations can arise 
through direct calls on a single object and also through callbacks between reachable 
objects, as for example in the observer pattern. We impose no restrictions on such 
calls. A fixpoint construction is used for the method environment which comprises 
the semantics of the class table. 

The interdependence between constructors and object construction commands 
(new) is a bit complex; things pertaining to constructors may be skipped on first 
reading. As a way of explaining the fine points, we prove in some detail that the 
semantics is well defined (Lemma 5.7). 

Often we write = between expressions involving partial functions such as those 
used in typing. Unless otherwise indicated, it means strong equality: both sides 
are defined and equal.

5.1 Semantic domains 

The state of a method in execution consists of a heap $h$, which is a finite^10
partial function from locations to object states, and a store $\eta$, which assigns locations
and primitive values to the local variables and parameters given by a typing
context $\Gamma$.^11 An object state is a mapping from field names to values. Function
application associates to the left, so $h\,\ell\,f$ is the value of field $f$ of the object $h\,\ell$ at
location $\ell$.

A command denotes a function mapping each initial state $(h, \eta)$ either to a final
state $(h_0, \eta_0)$ or to the distinguished value $\bot$. We use the term global state for
$(h, \eta)$, to distinguish it from object states. The improper value $\bot$ represents
non-termination as well as runtime errors: attempts to dereference nil or cast a location
to a type it does not have.

In some languages it is a runtime error to dereference a dangling pointer, i.e., 
one not in the domain of the heap. In Java dangling pointers cannot arise: there 
is no command for deallocation and a correct garbage collector never deallocates 
reachable objects. For our purposes, garbage collection need not be modelled. 
Commands act on heaps and stores that are closed in the sense that all locations 
that occur are in the domain of the heap. The following paragraphs formalize our 
assumptions about locations and then define the semantic domains. 

For locations, we assume that a countable set $\mathit{Loc}$ is given, along with a distinguished
value nil not in $\mathit{Loc}$. To track each object's class we assume given a
function $\mathit{loctype} : \mathit{Loc} \to \mathit{ClassNames}$ such that for each $C$ there are infinitely many
locations $\ell$ with $\mathit{loctype}\,\ell = C$. We use the term heap for any partial function $h$
such that $\mathit{dom}\,h \subseteq_{\mathit{fin}} \mathit{Loc}$ and each $h\,\ell$ is an object state of type $\mathit{loctype}\,\ell$. Object
states are formalized later. Because the domain of a heap is finite, the assumption
about $\mathit{loctype}$ ensures an adequate supply of fresh locations.

Definition 5.1 (allocator, parametric) An allocator is a location-valued function
$\mathit{fresh}$ such that $\mathit{loctype}(\mathit{fresh}(C, h)) = C$ and $\mathit{fresh}(C, h) \notin \mathit{dom}\,h$, for all
$C$, $h$. An allocator is parametric if $\mathit{dom}\,h_1 \cap \mathit{locs}\,C = \mathit{dom}\,h_2 \cap \mathit{locs}\,C$ implies
$\mathit{fresh}(C, h_1) = \mathit{fresh}(C, h_2)$. □

^10 The preliminary version [Banerjee and Naumann 2002a] of this paper has a bug: infinite heaps
are allowed, and it is not required that there be unallocated locations at every type.
^11 In [Banerjee and Naumann 2002a] we use the term "environment" for $\eta$, wishing to avoid the
irrelevant connotations of "stack"; here we use "store", following Reynolds [2001].

For example, taking $\mathit{Loc} = \mathbb{N}$, a parametric allocator is given by the function
$\mathit{fresh}(C, h) = \min\{\ell \mid \mathit{loctype}\,\ell = C \wedge \ell \notin \mathit{dom}\,h\}$.

Typical implementations encode the object class as part of its state. One could
uncurry this representation of heaps and take $\mathit{Loc}$ to be $\mathbb{N} \times \mathit{ClassNames}$. Then
$\mathit{fresh}(C, h)$ could return $(n, C)$ where $n$ is the least address of an unused memory
segment of sufficient size for the state of $C$. This is an allocator but not parametric
because the presence of objects of one class affects the availability of memory for
objects of other classes.
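
The two allocators just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names in `CLASSES`, the round-robin choice of `loctype`, and the name `fresh_nonparametric` are invented for the example, and the second allocator ignores segment sizes and simply picks the least unused address.

```python
from itertools import count

CLASSES = ["Own", "Rep", "Client"]  # hypothetical class names

def loctype(loc):
    """Assumed typing of locations for Loc = N: classes are assigned
    round-robin, so every class has infinitely many locations."""
    return CLASSES[loc % len(CLASSES)]

def fresh(cls, heap):
    """Parametric allocator: least location of type cls not in dom(heap).
    Terminates because dom(heap) is finite."""
    for loc in count():
        if loctype(loc) == cls and loc not in heap:
            return loc

def fresh_nonparametric(cls, heap):
    """Allocator for Loc = N x ClassNames (heap keys are (address, class)).
    The chosen address depends on objects of every class, so the result
    is not determined by the cls-typed part of dom(heap) alone."""
    used = {addr for (addr, _) in heap}
    return (next(n for n in count() if n not in used), cls)
```

Two heaps that agree on their `Own` locations receive the same fresh `Own` location from `fresh`, whereas `fresh_nonparametric` can be made to return different results by allocating an unrelated `Rep` object.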

We define the semantics in terms of an arbitrary allocator fresh. The assump- 
tion of parametricity is stated explicitly where it is needed, namely for the first 
abstraction theorem (Sect. 7) but not the second (Sect. 10). Parametricity of the 
allocator is a reasonable assumption for some applications but not all. The as- 
sumption streamlines the proof of the abstraction theorem, allowing us to highlight 
other issues. 

In addition to heaps, it is convenient to name a number of other semantic cate- 
gories that are explained in due course. 

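Semantic categories

\[
\theta ::= T \mid \Gamma \mid \mathsf{state}\,C \mid \mathit{Heap} \mid \mathit{Heap} \otimes \Gamma \mid \mathit{Heap} \otimes T \mid \theta_\bot \mid (C, \bar{x}, \bar{T} \to T) \mid \mathit{MEnv}
\]
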

In order to define the more complicated semantic domains, we need to define 
closed stores. Stores are among the simpler semantic domains, which are defined 
as follows. 

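Semantics of types, object states, and stores

\[
\begin{aligned}
[\![\mathsf{bool}]\!] &= \{\mathit{true}, \mathit{false}\} \\
[\![\mathsf{unit}]\!] &= \{\mathit{it}\} \\
[\![C]\!] &= \{\mathit{nil}\} \cup \mathit{locs}(C{\downarrow}) \\
[\![\mathsf{state}\,C]\!] &= \{ s \mid \mathit{dom}\,s = \mathit{dom}(\mathit{fields}\,C) \wedge \forall (f : T) \in \mathit{fields}\,C \bullet s\,f \in [\![T]\!] \} \\
[\![\Gamma]\!] &= \{ \eta \mid \mathit{dom}\,\eta = \mathit{dom}\,\Gamma \wedge \eta\,\mathsf{self} \neq \mathit{nil} \wedge \forall x \in \mathit{dom}\,\eta \bullet \eta\,x \in [\![\Gamma\,x]\!] \}
\end{aligned}
\]

We write $\mathit{locs}\,C$ for $\{\ell \in \mathit{Loc} \mid \mathit{loctype}\,\ell = C\}$, and $\mathit{locs}(C{\downarrow})$ for $\{\ell \mid \mathit{loctype}\,\ell \leq C\}$.
There is no independent meaning for $C{\downarrow}$. As small dot has another use, we use the
fat dot $\bullet$ to separate a bound variable from its scope. Note that $[\![\Gamma]\!]$ is defined for
$\Gamma$ both with and without result in its domain.

Definition 5.2 (closed heap and store) A heap $h$ is closed, written $\mathit{closed}\,h$,
iff $\mathit{rng}(h\,\ell) \cap \mathit{Loc} \subseteq \mathit{dom}\,h$, for all $\ell \in \mathit{dom}\,h$. A store $\eta \in [\![\Gamma]\!]$ is closed in heap $h$,
written $\mathit{closed}(h, \eta)$, iff $\mathit{rng}\,\eta \cap \mathit{Loc} \subseteq \mathit{dom}\,h$. □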
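
The closure conditions of Definition 5.2 can be checked mechanically. The sketch below assumes a hypothetical encoding in which heaps and stores are Python dictionaries, locations are integers, nil is `None`, and booleans are Python `bool` values; none of these encoding choices come from the paper.

```python
NIL = None  # stands for the distinguished value nil

def is_loc(v):
    """In this sketch locations are ints; booleans and NIL are not locations."""
    return isinstance(v, int) and not isinstance(v, bool)

def closed_heap(h):
    """closed h: every location stored in a field of an object is in dom h."""
    return all(v in h
               for state in h.values()
               for v in state.values()
               if is_loc(v))

def closed_store(h, eta):
    """closed(h, eta): every location in the range of eta is in dom h."""
    return all(v in h for v in eta.values() if is_loc(v))
```

A heap with a field pointing outside its own domain models a dangling pointer and is rejected by `closed_heap`.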
Recall that fresh locations should occur nowhere in the global state. For a closed
store and heap, this follows from the requirement that $\mathit{fresh}(C, h) \notin \mathit{dom}\,h$.^12

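Semantics of global states and methods

\[
\begin{aligned}
[\![\mathit{Heap}]\!] &= \{ h \mid \mathit{dom}\,h \subseteq_{\mathit{fin}} \mathit{Loc} \wedge \mathit{closed}\,h \wedge \forall \ell \in \mathit{dom}\,h \bullet h\,\ell \in [\![\mathsf{state}\,(\mathit{loctype}\,\ell)]\!] \} \\
[\![\mathit{Heap} \otimes \Gamma]\!] &= \{ (h, \eta) \mid h \in [\![\mathit{Heap}]\!] \wedge \eta \in [\![\Gamma]\!] \wedge \mathit{closed}(h, \eta) \} \\
[\![\mathit{Heap} \otimes T]\!] &= \{ (h, d) \mid h \in [\![\mathit{Heap}]\!] \wedge d \in [\![T]\!] \wedge (d \in \mathit{Loc} \Rightarrow d \in \mathit{dom}\,h) \} \\
[\![\theta_\bot]\!] &= [\![\theta]\!] \cup \{\bot\} \quad \text{(where $\bot$ is some fresh value not in $[\![\theta]\!]$)} \\
[\![C, \bar{x}, \bar{T} \to T]\!] &= [\![\mathit{Heap} \otimes (\bar{x} : \bar{T}, \mathsf{self} : C)]\!] \to [\![(\mathit{Heap} \otimes T)_\bot]\!] \\
[\![\mathit{MEnv}]\!] &= \{ \mu \mid \forall C, m \bullet \mu\,C\,m \text{ is defined iff } \mathit{mtype}(m, C) \text{ is defined,} \\
&\qquad \text{and } \mu\,C\,m \in [\![C, \mathit{pars}(m, C), \mathit{mtype}(m, C)]\!] \text{ if } \mu\,C\,m \text{ defined} \}
\end{aligned}
\]
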

Just as a class declaration $CT(C)$ gives a collection of method declarations, the
semantics of a class table is a method environment that assigns to each class $C$ a
method meaning $\mu\,C\,m$ for each $m$ declared or inherited in $C$.

For the fixpoint construction of the method environment denoted by a class
table, we need to impose order on the semantic domains. We use the term complete
partial order for a poset with least upper bounds of countable ascending
chains [Davey and Priestley 1990]. The degenerate case is ordering by equality,
which is the order we use for the semantics of $T$, $\Gamma$, $\mathsf{state}\,C$, $\mathit{Heap}$, $(\mathit{Heap} \otimes \Gamma)$,
and $(\mathit{Heap} \otimes T)$. Then $[\![(\mathit{Heap} \otimes \Gamma)_\bot]\!]$ and $[\![(\mathit{Heap} \otimes T)_\bot]\!]$ are complete partial
orders with the "flat" order: $\bot$ is below anything and other comparable elements
are equal. The set $[\![C, \bar{x}, \bar{T} \to T]\!]$ is defined to be the space of total functions
$[\![\mathit{Heap} \otimes (\bar{x} : \bar{T}, \mathsf{self} : C)]\!] \to [\![(\mathit{Heap} \otimes T)_\bot]\!]$, all of which are continuous because
$\mathit{Heap} \otimes (\bar{x} : \bar{T}, \mathsf{self} : C)$ is ordered by equality. The function space itself is ordered
pointwise, making it a complete partial order with minimum element $\lambda(h, \eta) \bullet \bot$.
Finally, we order $[\![\mathit{MEnv}]\!]$ pointwise. All method environments $\mu$ in $[\![\mathit{MEnv}]\!]$ have
the same domain, determined by $CT$, so this is also a complete partial order, taken
pointwise. It has a minimum element, namely $\lambda C \bullet \lambda m \bullet \lambda(h, \eta) \bullet \bot$.

Whereas $[\![\mathsf{state}\,C]\!]$ consists of the states for objects of exactly class $C$, the set
$[\![C]\!]$ is downward closed. For data types $T_1$, $T_2$, we have that $T_1 \leq T_2$ implies $[\![T_1]\!] \subseteq [\![T_2]\!]$.

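Definition 5.3 (incomparable, $\nleq\ngeq$) We write $C \mathrel{\nleq\ngeq} B$ for $C \nleq B \wedge C \ngeq B$. For a
list $\bar{C}$, $\bar{C} \mathrel{\nleq\ngeq} B$ means $C \mathrel{\nleq\ngeq} B$ for all $C$ in $\bar{C}$. □

Lemma 5.4 For classes $C$, $B$, if $C \mathrel{\nleq\ngeq} B$ then $[\![C]\!] \cap [\![B]\!] = \{\mathit{nil}\}$. For primitive $T$
we have $[\![T]\!] \cap [\![B]\!] = \emptyset$. □

The result is a direct consequence of the definitions. We often use the contrapositive:
if there is a non-nil location in both $[\![B]\!]$ and $[\![C]\!]$ then $B \leq C$ or $C \leq B$.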



^12 If dangling pointers were allowed, the definition of freshness would need to be with respect to
both the store and all object states in the heap. The issue becomes apparent in the proof of
Lemma 6.16 in the sequel, which uses closure. Most of the other definitions and results can be
formulated without restricting heaps to be closed, so we mistakenly neglected closure in [Banerjee
and Naumann 2002a].
5.2 Semantics of expressions, commands, constructors and methods

For expressions and commands, the semantics is defined by induction on typing
derivations. As a consequence of uniqueness of typing derivations, Lemma 4.5, the
semantics is a function of typings. The meaning of a command $\Gamma \vdash S$ will be defined
to be a function

\[
[\![\Gamma \vdash S]\!] \in [\![\mathit{MEnv}]\!] \to [\![\mathit{Heap} \otimes \Gamma]\!] \to [\![(\mathit{Heap} \otimes \Gamma)_\bot]\!] .
\]

The meaning of an expression $\Gamma \vdash e : T$ will be defined to be a function

\[
[\![\Gamma \vdash e : T]\!] \in [\![\mathit{Heap} \otimes \Gamma]\!] \to [\![T_\bot]\!]
\]

such that the result value is always in the domain of the heap if it is a location.^13
This is part of Lemma 5.7, the proof of which serves as an exposition for some
details of the semantic definitions.

The command and expression constructs are strict in $\bot$, except, as usual, for the
then- and else-commands in if ... fi. To streamline the treatment of $\bot$ in the semantic
definitions we use a metalanguage construct which some readers will recognize as the
bind operation of the lifting monad [Moggi 1991]. The construct $\mathsf{let}\ d = E_1\ \mathsf{in}\ E_2$
has the following meaning: If the value of $E_1$ is $\bot$ then that is the value of the
entire let expression; otherwise, its value is the value of $E_2$ with $d$ bound to the
value of $E_1$.
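
The strict let construct can be sketched in Python as a bind function. The sentinel `BOTTOM`, the encoding of nil as `None`, and the helper names `bind`, `field_access`, and `read_two_levels` are assumptions made for this illustration.

```python
BOTTOM = object()  # stands for the improper value (bottom)

def bind(value, body):
    """let d = value in body(d): propagate BOTTOM, otherwise apply body."""
    return BOTTOM if value is BOTTOM else body(value)

def field_access(h, loc, f):
    """Field read in the style of the semantics: dereferencing nil
    (None here) yields BOTTOM instead of raising an exception."""
    return BOTTOM if loc is None else h[loc][f]

def read_two_levels(h, loc, f, g):
    """let l1 = loc.f in l1.g, with BOTTOM propagated by bind."""
    return bind(field_access(h, loc, f),
                lambda l1: field_access(h, l1, g))
```

Because `bind` short-circuits on `BOTTOM`, a failure in the first field read makes the whole chained read improper without any explicit check at the call site.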

We let $(h, \eta) \in [\![\mathit{Heap} \otimes \Gamma]\!]$ in the following definitions. Identifiers are as in
the corresponding typing rules. For semantic values we use the identifier $d$, but
sometimes $\ell$ for elements of the sets $[\![C]\!]$.

For expressions the semantics is straightforward; we choose the Java semantics 
for casts and tests. 



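Semantics of expressions

\[
\begin{aligned}
[\![\Gamma \vdash x : T]\!](h, \eta) &= \eta\,x \\
[\![\Gamma \vdash \mathsf{null} : B]\!](h, \eta) &= \mathit{nil} \\
[\![\Gamma \vdash \mathsf{it} : \mathsf{unit}]\!](h, \eta) &= \mathit{it} \\
[\![\Gamma \vdash \mathsf{true} : \mathsf{bool}]\!](h, \eta) &= \mathit{true} \\
[\![\Gamma \vdash \mathsf{false} : \mathsf{bool}]\!](h, \eta) &= \mathit{false} \\
[\![\Gamma \vdash e_1 = e_2 : \mathsf{bool}]\!](h, \eta) &= \mathsf{let}\ d_1 = [\![\Gamma \vdash e_1 : T_1]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{let}\ d_2 = [\![\Gamma \vdash e_2 : T_2]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ d_1 = d_2\ \mathsf{then}\ \mathit{true}\ \mathsf{else}\ \mathit{false} \\
[\![\Gamma \vdash e.f : T]\!](h, \eta) &= \mathsf{let}\ \ell = [\![\Gamma \vdash e : (\Gamma\,\mathsf{self})]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ \ell = \mathit{nil}\ \mathsf{then}\ \bot\ \mathsf{else}\ h\,\ell\,f \\
[\![\Gamma \vdash (B)\ e : B]\!](h, \eta) &= \mathsf{let}\ \ell = [\![\Gamma \vdash e : D]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ \ell = \mathit{nil} \vee \mathit{loctype}\,\ell \leq B\ \mathsf{then}\ \ell\ \mathsf{else}\ \bot \\
[\![\Gamma \vdash e\ \mathsf{is}\ B : \mathsf{bool}]\!](h, \eta) &= \mathsf{let}\ \ell = [\![\Gamma \vdash e : D]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ \ell \neq \mathit{nil} \wedge \mathit{loctype}\,\ell \leq B\ \mathsf{then}\ \mathit{true}\ \mathsf{else}\ \mathit{false}
\end{aligned}
\]

^13 We have chosen a simple but slightly inelegant formulation. We express closure of the result
for commands in the semantic domain whereas for expressions there is no returned heap and
we express closure as a property of the semantic function. The presentation could be made
more elegant by introducing categories $\mathit{exp}(\Gamma, T)$ and $\mathit{com}(\Gamma)$ with $[\![\mathit{com}(\Gamma)]\!] = [\![\mathit{MEnv}]\!] \to
[\![\mathit{Heap} \otimes \Gamma]\!] \to [\![(\mathit{Heap} \otimes \Gamma)_\bot]\!]$ and imposing the restriction on return values in the definition of
$[\![\mathit{exp}(\Gamma, T)]\!]$ as a subset of $[\![\mathit{Heap} \otimes \Gamma]\!] \to [\![T_\bot]\!]$. We could even restrict the meanings to those
that are confined, but the gain in elegance would come at the expense of complexity that not
all readers would find illuminating. We have chosen to treat confinement and parametricity as
properties to be proved after the semantics is defined, downplaying the model as an independent
structure. Thus little would be gained by naming categories $\mathit{exp}(\Gamma, T)$ and $\mathit{com}(\Gamma)$.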



The semantics of commands is defined by structural induction on the command, 
except for object construction x := new C which also depends on the constructor 
semantics of the constructor, constr C, of C. That in turn depends on the con- 
structor of super C, and on the command semantics of constr C. Well foundedness 
of this dependence is part of the proof of Lemma 5.7. 

In the semantics of commands, we write $\mathit{fields}\,B \mapsto \mathit{defaults}$ as an abbreviation
for the function sending each $f \in \mathit{dom}(\mathit{fields}\,B)$ to the default value for $\mathit{type}(f, B)$.
The defaults are false for bool, it for unit, and nil for classes. Function update or
extension is written, e.g., $[\eta \mid x \mapsto d]$. We write $\upharpoonright$ for domain restriction: if $x$ is in
the domain of $\eta$ then $\eta \upharpoonright x$ is the function like $\eta$ but without $x$ in its domain.



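Semantics of commands

\[
\begin{aligned}
[\![\Gamma \vdash x := e]\!]\mu(h, \eta) &= \mathsf{let}\ d = [\![\Gamma \vdash e : T]\!](h, \eta)\ \mathsf{in}\ (h, [\eta \mid x \mapsto d]) \\
[\![\Gamma \vdash e_1.f := e_2]\!]\mu(h, \eta) &= \mathsf{let}\ \ell = [\![\Gamma \vdash e_1 : (\Gamma\,\mathsf{self})]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ \ell = \mathit{nil}\ \mathsf{then}\ \bot\ \mathsf{else} \\
&\quad\ \mathsf{let}\ d = [\![\Gamma \vdash e_2 : U]\!](h, \eta)\ \mathsf{in} \\
&\quad\ ([h \mid \ell \mapsto [h\,\ell \mid f \mapsto d]], \eta) \\
[\![\Gamma \vdash x := \mathsf{new}\ B]\!]\mu(h, \eta) &= \mathsf{let}\ \ell = \mathit{fresh}(B, h)\ \mathsf{in} \\
&\quad\ \mathsf{let}\ h_1 = [h \mid \ell \mapsto [\mathit{fields}\,B \mapsto \mathit{defaults}]]\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \eta_1 = [\mathsf{self} \mapsto \ell]\ \mathsf{in} \\
&\quad\ \mathsf{let}\ h_0 = [\![\mathsf{self} : B \vdash \mathit{constr}\,B : \mathsf{con}]\!]\mu(h_1, \eta_1)\ \mathsf{in} \\
&\quad\ (h_0, [\eta \mid x \mapsto \ell]) \\
[\![\Gamma \vdash x := e.m(\bar{e})]\!]\mu(h, \eta) &= \mathsf{let}\ \ell = [\![\Gamma \vdash e : D]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ \ell = \mathit{nil}\ \mathsf{then}\ \bot\ \mathsf{else} \\
&\quad\ \mathsf{let}\ \bar{x} = \mathit{pars}(m, D)\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \bar{d} = [\![\Gamma \vdash \bar{e} : \bar{U}]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \eta_1 = [\bar{x} \mapsto \bar{d}, \mathsf{self} \mapsto \ell]\ \mathsf{in} \\
&\quad\ \mathsf{let}\ (h_1, d_1) = \mu(\mathit{loctype}\,\ell)m(h, \eta_1)\ \mathsf{in} \\
&\quad\ (h_1, [\eta \mid x \mapsto d_1]) \\
[\![\Gamma \vdash x := \mathsf{super}.m(\bar{e})]\!]\mu(h, \eta) &= \mathsf{let}\ \ell = \eta\,\mathsf{self}\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \bar{x} = \mathit{pars}(m, \Gamma\,\mathsf{self})\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \bar{d} = [\![\Gamma \vdash \bar{e} : \bar{U}]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \eta_1 = [\bar{x} \mapsto \bar{d}, \mathsf{self} \mapsto \ell]\ \mathsf{in} \\
&\quad\ \mathsf{let}\ (h_1, d_1) = \mu(\mathit{super}(\Gamma\,\mathsf{self}))m(h, \eta_1)\ \mathsf{in} \\
&\quad\ (h_1, [\eta \mid x \mapsto d_1]) \\
[\![\Gamma \vdash S_1; S_2]\!]\mu(h, \eta) &= \mathsf{let}\ (h_1, \eta_1) = [\![\Gamma \vdash S_1]\!]\mu(h, \eta)\ \mathsf{in} \\
&\quad\ [\![\Gamma \vdash S_2]\!]\mu(h_1, \eta_1) \\
[\![\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}]\!]\mu(h, \eta) &= \mathsf{let}\ b = [\![\Gamma \vdash e : \mathsf{bool}]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{if}\ b\ \mathsf{then}\ [\![\Gamma \vdash S_1]\!]\mu(h, \eta)\ \mathsf{else}\ [\![\Gamma \vdash S_2]\!]\mu(h, \eta) \\
[\![\Gamma \vdash T\ x := e\ \mathsf{in}\ S]\!]\mu(h, \eta) &= \mathsf{let}\ d = [\![\Gamma \vdash e : U]\!](h, \eta)\ \mathsf{in} \\
&\quad\ \mathsf{let}\ \eta_1 = [\eta \mid x \mapsto d]\ \mathsf{in} \\
&\quad\ \mathsf{let}\ (h_1, \eta_2) = [\![(\Gamma, x : T) \vdash S]\!]\mu(h, \eta_1)\ \mathsf{in} \\
&\quad\ (h_1, (\eta_2 \upharpoonright x))
\end{aligned}
\]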



Method calls of the form $x := e.m(\bar{e})$ are dynamically bound: the method meaning
is determined by $\mathit{loctype}\,\ell$ in the semantic definition, where $\ell$ is the value of
$e$. By typing, $\mathit{loctype}\,\ell \leq D$ and $\mathit{pars}(m, \mathit{loctype}\,\ell) = \mathit{pars}(m, D)$. Super-calls are
statically bound: the method meaning used, $\mu(\mathit{super}\,C)m$, is determined by the
static class $C$. Note that if $\mathit{mtype}(m, \mathit{super}\,C)$ is defined, as required by the typing
rule, then $\mathit{pars}(m, C) = \mathit{pars}(m, \mathit{super}\,C)$.

The meaning of a command $S$ as a constructor is a function

\[
[\![\mathsf{self} : C \vdash S : \mathsf{con}]\!] \in [\![\mathit{MEnv}]\!] \to [\![\mathit{Heap} \otimes (\mathsf{self} : C)]\!] \to [\![\mathit{Heap}_\bot]\!] .
\]
Dependence on $[\![\mathit{MEnv}]\!]$ is a formal technicality; the semantic definition uses the
command semantics of $S$, but the typing rule disallows method calls in $S$.

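Semantics of constructor

\[
\begin{aligned}
[\![\mathsf{self} : C \vdash S : \mathsf{con}]\!]\mu(h, \eta) &= \mathsf{let}\ B = \mathit{super}\,C\ \mathsf{in} \\
&\quad\ \mathsf{let}\ S_0 = \mathit{constr}\,B\ \mathsf{in} \\
&\quad\ \mathsf{let}\ h_1 = \mathsf{if}\ B \neq \mathit{Object}\ \mathsf{then}\ [\![\mathsf{self} : B \vdash S_0 : \mathsf{con}]\!]\mu(h, \eta)\ \mathsf{else}\ h\ \mathsf{in} \\
&\quad\ \mathsf{let}\ (h_0, -) = [\![\mathsf{self} : C \vdash S]\!]\mu(h_1, \eta)\ \mathsf{in} \\
&\quad\ h_0
\end{aligned}
\]

Note that if $[\![\mathsf{self} : B \vdash S_0 : \mathsf{con}]\!]\mu(h, \eta)$ or $[\![\mathsf{self} : C \vdash S]\!]\mu(h_1, \eta)$ is $\bot$ then so is
$[\![\mathsf{self} : C \vdash S : \mathsf{con}]\!]\mu(h, \eta)$. The result $\bot$ is possible due to nil dereferences and cast
failures but not divergence (because there are no method calls or cyclic constructor
dependencies).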

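Semantics of method declaration

Suppose $M$ is a method declaration in $CT(C)$, with $M = T\ m(\bar{T}\ \bar{x})\{S\}$. Its
meaning $[\![M]\!]$ is the total function $[\![\mathit{MEnv}]\!] \to [\![C, \bar{x}, \bar{T} \to T]\!]$ defined by

\[
\begin{aligned}
[\![M]\!]\mu(h, \eta) &= \mathsf{let}\ \eta_1 = [\eta \mid \mathsf{result} \mapsto \mathit{default}]\ \mathsf{in} \\
&\quad\ \mathsf{let}\ (h_0, \eta_0) = [\![\bar{x} : \bar{T}, \mathsf{self} : C, \mathsf{result} : T \vdash S]\!]\mu(h, \eta_1)\ \mathsf{in} \\
&\quad\ (h_0, \eta_0\,\mathsf{result})
\end{aligned}
\]

For precision in the semantics of a method inherited in $C$ from $B$ we make an
explicit definition for the domain-restriction of a method meaning in $[\![B, \bar{x}, \bar{T} \to T]\!]$
to the global states $(h, \eta)$ in $[\![\mathit{Heap} \otimes (\bar{x} : \bar{T}, \mathsf{self} : C)]\!]$.

Definition 5.5 (restr) For $d \in [\![B, \bar{x}, \bar{T} \to T]\!]$ and $C \leq B$, define $\mathit{restr}(d, C)$, an
element of $[\![C, \bar{x}, \bar{T} \to T]\!]$, by $\mathit{restr}(d, C)(h, \eta) = d(h, \eta)$. □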

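Semantics of class table and its approximation chain $\mu_j$

The semantics of a well formed class table $CT$, written $[\![CT]\!]$, is the least upper
bound of the ascending chain $\mu \in \mathbb{N} \to [\![\mathit{MEnv}]\!]$ defined as follows.

\[
\begin{aligned}
\mu_0\,C\,m &= \lambda(h, \eta) \bullet \bot && \text{if $m$ is declared or inherited in $C$} \\
\mu_{j+1}\,C\,m &= [\![M]\!]\mu_j && \text{if $m$ is declared as $M$ in $C$} \\
\mu_{j+1}\,C\,m &= \mathit{restr}((\mu_{j+1}\,B\,m), C) && \text{if $m$ is inherited in $C$ from $B$}
\end{aligned}
\]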
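
The approximation chain can be illustrated with a deliberately simplified Python sketch. It assumes a single hypothetical recursive method `count` over integers (the language of the paper has only bool, unit, and class types), ignores the heap and store, and models the improper value by `None`. The names `mu0`, `meaning_of_body`, and `approximation` are invented for the example.

```python
BOTTOM = None  # improper value: divergence or error

def mu0(n):
    """Least approximation: the method meaning that is BOTTOM everywhere."""
    return BOTTOM

def meaning_of_body(mu):
    """Meaning of the hypothetical declaration
         count(n) { if n = 0 then result := 0
                    else result := 1 + self.count(n - 1) }
    relative to mu, which interprets the recursive call."""
    def method(n):
        if n == 0:
            return 0
        r = mu(n - 1)  # call through the previous approximation
        return BOTTOM if r is BOTTOM else 1 + r
    return method

def approximation(j):
    """mu_j, obtained by applying the body meaning j times to mu0."""
    mu = mu0
    for _ in range(j):
        mu = meaning_of_body(mu)
    return mu
```

Each step of the chain defines the method on one more level of call depth while agreeing with earlier approximations wherever those were already proper, which is the ascending-chain property the least upper bound relies on.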



Remark 5.6 (On proofs) We give some proofs in considerable detail. To avoid
repetition, we use the same identifiers as in the relevant semantic definition for
each case (often different from those in the statement of the result being proved),
taking care to avoid ambiguity. This saves explicit introduction of the identifiers
or mention of the ranges and scopes of quantification. But it requires the reader to
keep an eye on the semantic clauses. Often, without remark, we consider only the
case where the outcome and various intermediate values are non-$\bot$, as the $\bot$ cases
are straightforward.

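Lemma 5.7 (semantics is well defined and typed) Let $CT$ be well formed.

(1) If $C \leq B$ then for any $\Gamma$ with $\mathsf{self} \notin \mathit{dom}\,\Gamma$ we have $[\![\mathit{Heap} \otimes (\Gamma, \mathsf{self} : C)]\!] \subseteq
[\![\mathit{Heap} \otimes (\Gamma, \mathsf{self} : B)]\!]$.

(2) If $\Gamma \vdash e : T$ then $[\![\Gamma \vdash e : T]\!] \in [\![\mathit{Heap} \otimes \Gamma]\!] \to [\![T_\bot]\!]$.

(3) If $(h, \eta) \in [\![\mathit{Heap} \otimes \Gamma]\!]$ and $d = [\![\Gamma; C \vdash e : T]\!](h, \eta)$ with $d \neq \bot$ then
$(h, d) \in [\![\mathit{Heap} \otimes T]\!]$.

(4) If $\Gamma \vdash S$ then $[\![\Gamma \vdash S]\!] \in [\![\mathit{MEnv}]\!] \to [\![\mathit{Heap} \otimes \Gamma]\!] \to [\![(\mathit{Heap} \otimes \Gamma)_\bot]\!]$.

(5) $[\![CT]\!]$ is well defined.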

Proof. (1) follows easily from the fact that $C \leq B$ implies $[\![C]\!] \subseteq [\![B]\!]$.

For (2), inspection of the definitions shows that $[\![\Gamma; C \vdash e : T]\!](h, \eta)$ is in $[\![T_\bot]\!]$.
It is property $(h, d) \in [\![\mathit{Heap} \otimes T]\!]$, i.e., (3), that we need explicitly in some proof
steps. This holds because $(h, \eta)$ is closed and no expression creates fresh locations.

Property (4) requires a straightforward but not entirely trivial check that, for any
$\mu$, $[\![\Gamma; C \vdash S]\!]\mu(h, \eta)$ is in $[\![(\mathit{Heap} \otimes \Gamma)_\bot]\!]$. For example, in the case of method call
$x := e.m(\bar{e})$ we need the fact that $\mu\,C\,m$ is in $[\![C, \mathit{pars}(m, C), \mathit{mtype}(m, C)]\!]$ regardless
of whether $m$ is declared or inherited in $C$. The store $\eta_1$ is passed to the method
meaning $\mu(\mathit{loctype}\,\ell)m$ determined by the type, $\mathit{loctype}\,\ell$, of the target. Note that
$\eta_1\,\mathsf{self} = \ell$ and $\mu(\mathit{loctype}\,\ell)m$ is from a declaration in $\mathit{loctype}\,\ell$ or a superclass, so $\eta_1$
is in its domain by (1). Of course the call aborts if $\ell = \mathit{nil}$.

For (5), acyclicity of < ensures that the semantics of the class table is well founded
on inheritance depth. And (1) ensures that the definition $\mu_{j+1}\,C\,m$ for an inherited
method yields a value in the semantic domain $[\![C, \mathit{pars}(m, C), \mathit{mtype}(m, C)]\!]$. We
only take fixpoints for method environments, which form a complete partial order
with bottom. The fixpoint is well defined because the meaning $[\![M]\!]$ of a method
declaration $M$ is a continuous function of the method environment. This is because
each $[\![\Gamma \vdash S]\!]$ is a continuous function on method environments, which in turn
depends on the fact that the semantic definitions for commands are continuous in
their constituent commands and expressions.

The semantics of object construction commands (new) is mutually dependent 
on the semantics of constructors. This is resolved as follows. 

First, the semantics of constructors is defined by well founded recursion on the
order $\ll$ on classes. For semantics of $\mathsf{self} : B \vdash \mathit{constr}\,B : \mathsf{con}$ we use both (a) the
constructor semantics of $\mathsf{self} : (\mathit{super}\,B) \vdash \mathit{constr}(\mathit{super}\,B) : \mathsf{con}$ and (b) the command
semantics for $\mathit{constr}\,B$. For (a), note that $\mathit{super}\,C \ll C$ by Lemma 4.9. For
(b), note that if $\mathit{constr}\,C$ uses new $B$ for other classes $B$, we have $B \sqsubset C$ by a
condition on well formed class tables; then $B \ll C$ by Lemma 4.9. Note that there
is no dependence on the method environment.

Finally, for semantics of methods we need all constructors as there is no restriction 
on which objects can be constructed. The semantics of methods is by structural 
recursion on method bodies, using the semantics of constructors. □ 
Fig. 6. Confinement scheme for island $j$. Dashed boxes are partition
blocks. Solid lines indicate allowed references and dotted lines indicate
prohibited ones. There is no restriction within blocks.



6. CONFINEMENT RAMIFIED 

Our aim is to support reasoning where simulations are specified on a per-island
basis, where an island consists of a single owner and its reps.^14 This section
formalizes a semantic notion of confinement suited to this purpose. In particular, it
takes into account the limited access to reps allowed for owner subclasses, which is
discussed further in Sect. 9.

6.1 Confinement of states 

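As discussed in Sect. 3.2 we assume that class names $\mathit{Own}$ and $\mathit{Rep}$ are given, such
that $\mathit{Own} \mathrel{\nleq\ngeq} \mathit{Rep}$ and thus $[\![\mathit{Own}]\!] \cap [\![\mathit{Rep}]\!] = \{\mathit{nil}\}$. As an abbreviation, we write
$\mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow})$ for $\mathit{locs}(\mathit{Own}{\downarrow}) \cup \mathit{locs}(\mathit{Rep}{\downarrow})$.

We say heaps $h_1$ and $h_2$ are disjoint if $\mathit{dom}\,h_1 \cap \mathit{dom}\,h_2 = \emptyset$. Let $h_1 * h_2$ be the
union of $h_1$ and $h_2$ if they are disjoint, and undefined otherwise.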
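
As a small illustration, the partial operation on heaps can be sketched in Python, assuming heaps are encoded as dictionaries from locations to object states and that `None` models "undefined". The function names `disjoint` and `star` are chosen for this example.

```python
def disjoint(h1, h2):
    """Heaps are disjoint when their domains share no location."""
    return not (h1.keys() & h2.keys())

def star(h1, h2):
    """h1 * h2: the union of two disjoint heaps; None models 'undefined'."""
    if not disjoint(h1, h2):
        return None
    return {**h1, **h2}
```

Note that the empty heap is a unit for this operation, so callers should test the result with `is None` rather than by truthiness.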

We shall partition the heap as $h = Ch * \ldots$ where $Ch$ contains client objects and
the rest is partitioned into islands of the form $Oh * Rh$ consisting of a singleton
heap $Oh$ with an owner object and a heap $Rh$ of its representation objects. In such
a partition, the heaps $Ch$, $Oh$, and $Rh$ need not be closed. An example is Fig. 5 in
Sect. 3.3; the general scheme is depicted in Fig. 6. Our use of the word "partition"
is slightly non-standard: we allow the blocks $Rh_i$ and $Ch$ to be empty.

Definition 6.1 (admissible partition) An admissible partition of heap $h$ is a
set of pairwise disjoint heaps $Ch, Oh_1, Rh_1, \ldots, Oh_k, Rh_k$, for $k \geq 0$, with

\[
h = Ch * Oh_1 * Rh_1 * \ldots * Oh_k * Rh_k
\]

and for all $i$ ($1 \leq i \leq k$)

- $\mathit{dom}\,Oh_i \subseteq \mathit{locs}(\mathit{Own}{\downarrow})$ and $\mathit{size}(\mathit{dom}\,Oh_i) = 1$ (owner blocks)
- $\mathit{dom}\,Rh_i \subseteq \mathit{locs}(\mathit{Rep}{\downarrow})$ (rep blocks)
- $\mathit{dom}\,Ch \cap \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) = \emptyset$ (client blocks)

^14 In particular, this entails describing how a simulation is established by an owner constructor
acting on a single owner object. As constructors have no parameters, one could define the
semantics in terms of constructors applied to a single object and yielding a small heap. But such
a constructor will in fact be executed in a larger heap. Suppose $(h, \eta) \in [\![\mathit{Heap} \otimes \Gamma]\!]$, so that
everything reachable from $\eta$ is already in $h$. If $h'$ is a heap, not necessarily closed, such that
$h' * h$ is in $[\![\mathit{Heap}]\!]$, then it is immediate from the definitions that $(h' * h, \eta)$ is in $[\![\mathit{Heap} \otimes \Gamma]\!]$. For
any $S$ and $\mu$ we have $[\![\Gamma \vdash S]\!]\mu(h, \eta) = \bot$ iff $[\![\Gamma \vdash S]\!]\mu(h' * h, \eta) = \bot$, as can be shown using the
fact that $[\![\Gamma \vdash e : T]\!](h' * h, \eta) = [\![\Gamma \vdash e : T]\!](h, \eta)$. (Strictly speaking, this depends on $\mu$ having the
property; and then one shows that $[\![CT]\!]$ has the property.) What is not true is the following: if
$(h_0, \eta_0) = [\![\Gamma \vdash S]\!]\mu(h, \eta)$ then $[\![\Gamma \vdash S]\!]\mu(h' * h, \eta) = (h' * h_0, \eta_0)$. The reason is that the allocator
$\mathit{fresh}$ depends on the domain of the entire heap, and we have made no assumptions to relate its
behavior on $h$ and $h' * h$.

We have not checked the details but it seems clear that if $(h_0, \eta_0) = [\![\Gamma \vdash S]\!]\mu(h' * h, \eta)$ then
there is $h_0'$ such that $h_0 = h' * h_0'$ and $(h_0', \eta_0) \in [\![\mathit{Heap} \otimes \Gamma]\!]$. Also, for $S$ without method calls and
satisfying the dependency condition for constructors (Def. 4.4), if $h_0 = [\![\Gamma \vdash S : \mathsf{con}]\!]\mu(h' * h, \eta)$
then there is $h_0'$ such that $h_0 = h' * h_0'$. But to be useful for our purposes this property would
have to be strengthened to take partitions into account.



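Definition 6.2 (confined heap, confining partition, $\not\leadsto$) To say that no object
in $h_1$ contains a reference to an object in $h_2$, we define $\not\leadsto$ by

\[
h_1 \not\leadsto h_2 \iff \forall \ell \in \mathit{dom}\,h_1 \bullet \mathit{rng}(h_1\,\ell) \cap \mathit{dom}\,h_2 = \emptyset .
\]

To say that no object in $h_1$ contains a reference to an object in $h_2$ except via a field
in $\bar{f}$, we define $\not\leadsto^{\bar{f}}$ by

\[
h_1 \not\leadsto^{\bar{f}} h_2 \iff \forall \ell \in \mathit{dom}\,h_1 \bullet \mathit{rng}((h_1\,\ell) \upharpoonright \bar{f}) \cap \mathit{dom}\,h_2 = \emptyset .
\]

A heap $h$ is confined, written $\mathit{conf}\,h$, iff it has a confining partition. A confining
partition is an admissible partition such that for all $j$, $i$ with $j \neq i$ we have

(1) $Ch \not\leadsto Rh_j$ (clients do not point to reps)

(2) $Oh_j \not\leadsto Rh_i$ (owners do not share reps)

(3) $Oh_j \not\leadsto^{\bar{g}} Rh_j$ where $\bar{g} = \mathit{dom}(\mathit{dfields}(\mathit{Own}))$ (reps are private to Own)

(4) $Rh_j \not\leadsto Oh_i * Rh_i$ (reps are confined to their islands)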
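
The four conditions of Definition 6.2 can be checked on concrete partitions. The following Python sketch assumes a hypothetical encoding: heaps are dictionaries from integer locations to object states, a partition is given as a client block plus a list of (owner block, rep block) pairs, and `own_fields` lists the fields declared in Own. Admissibility (disjointness and the typing of blocks) is assumed rather than checked.

```python
def rng_locs(state, skip=()):
    """Locations (ints in this sketch) stored in an object state,
    ignoring the fields listed in skip."""
    return {v for f, v in state.items()
            if f not in skip and isinstance(v, int) and not isinstance(v, bool)}

def no_refs(h1, h2, skip=()):
    """No object in h1 points into h2, except possibly via fields in skip."""
    return all(not (h2.keys() & rng_locs(s, skip)) for s in h1.values())

def is_confining(ch, islands, own_fields):
    """ch: client block; islands: list of (Oh_j, Rh_j) pairs."""
    for j, (oh_j, rh_j) in enumerate(islands):
        if not no_refs(ch, rh_j):                     # (1) clients do not point to reps
            return False
        if not no_refs(oh_j, rh_j, skip=own_fields):  # (3) reps only via Own's fields
            return False
        for i, (oh_i, rh_i) in enumerate(islands):
            if i == j:
                continue
            if not no_refs(oh_j, rh_i):               # (2) owners do not share reps
                return False
            if not no_refs(rh_j, {**oh_i, **rh_i}):   # (4) reps stay in their island
                return False
    return True
```

Condition (3) is the only one that uses the `skip` argument: an owner may reach its own reps, but only through fields declared in Own.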

A heap may have several admissible partitions, because there is no inherent order 
on islands and because unreachable reps can be put in any island. The definitions 
and results do not depend on choice of partition. We have not found a workable 
formulation that determines unique partitions. To describe the effect of confined
commands on partitions we use the following.

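Definition 6.3 (extension of confining partition, $\unlhd$) Define $h \unlhd h_0$ iff $h$ is
confined and for any confining partition of $h$,

\[
h = Ch * Oh_1 * Rh_1 * \ldots * Oh_k * Rh_k ,
\]

there is a confining partition of $h_0$,

\[
h_0 = Ch^0 * Oh^0_1 * Rh^0_1 * \ldots * Oh^0_n * Rh^0_n ,
\]

that is an extension in the sense that it satisfies the following:

- $n \geq k$
- $\mathit{dom}(Ch) \subseteq \mathit{dom}(Ch^0)$
- $\mathit{dom}(Oh_j) \subseteq \mathit{dom}(Oh^0_j)$ for all $j \leq k$
- $\mathit{dom}(Rh_j) \subseteq \mathit{dom}(Rh^0_j)$ for all $j \leq k$ □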

Confinement of a store depends on the class in which it may occur. For owners 
and reps it depends on the domain of the heap as well. 

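Definition 6.4 (confined store, global state) Let $h$ be a confined heap and $\eta$
be a store in $[\![\Gamma, \mathsf{self} : C]\!]$ for some $\Gamma$. We say $\eta$ is confined in $h$ for $C$ iff

(1) $C \nleq \mathit{Rep} \wedge C \nleq \mathit{Own} \Rightarrow \mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \emptyset$

(2) $C \leq \mathit{Own} \Rightarrow \mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$
for some confining partition and $j$ with $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$

(3) $C \leq \mathit{Rep} \Rightarrow \mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh_j)$
for some confining partition and $j$ with $\eta\,\mathsf{self} \in \mathit{dom}(Rh_j)$

A global state $(h, \eta)$ is confined, written $\mathit{conf}\ C\ (h, \eta)$, iff $h$ is confined and $\eta$ is
confined in $h$ for $C$. □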

Apropos the examples in Sect. 2.1, take Rep to be Bool and suppose the sequence 
z := new OBool; w:=z.bad() occurs in a method of some client class. Executed 
in a confined initial state, the state after assignment of a new OBool to z is still 
confined. The assignment to w then yields a state where the heap is confined but 
the client's store is not. 

6.2 Confinement of commands and methods 

A confined command is one that preserves confinement of global states. Because 
command meanings depend on the method environment and expression meanings, 
confinement for those is formalized first. We need to ensure that a method call 
yields a heap confined for the caller. This is achieved using the condition $h \unlhd h_0$
in the following.

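Definition 6.5 (confined method environment) Method environment $\mu$ is confined,
written $\mathit{conf}\,\mu$, if and only if the following holds for all $C$ and $m$ with
$\mathit{mtype}(m, C)$ defined. Let $\mathit{mtype}(m, C) = \bar{T} \to T$ and $\mathit{pars}(m, C) = \bar{x}$. For all
$(h, \eta) \in [\![\mathit{Heap} \otimes (\bar{x} : \bar{T}, \mathsf{self} : C)]\!]$, if $\mathit{conf}\ C\ (h, \eta)$ and $\mu\,C\,m(h, \eta) \neq \bot$ then

(1) $C \nleq \mathit{Rep} \Rightarrow \mathit{conf}\ C\ (h_0, \eta) \wedge h \unlhd h_0 \wedge d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$

(2) $C \leq \mathit{Rep} \Rightarrow \mathit{conf}\ C\ (h_0, \eta) \wedge h \unlhd h_0
\wedge (d \in \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Oh_j * Rh_j))$
for some confining partition $h_0 = Ch * Oh_1 * Rh_1 \ldots$
and $j$ with $\eta\,\mathsf{self} \in \mathit{dom}(Rh_j)$

where $(h_0, d) = \mu\,C\,m(h, \eta)$. □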

Condition (1) fails for method bad of the example in Sect. 2.1, regardless of whether 
the return type of bad is taken to be Object or Bool. 

The conditions for confinement of expressions are like those for confined stores 
— after all, a store provides the meaning for the expression x. The conditions are 
somewhat different for confined method environments, because methods are public 
and can be called both by clients and from within an owner island. (In Sect. 9, 
Def. 6.5 is refined to allow module-scoped owner methods to return reps.) Also, 
confinement of commands does not explicitly require heap extension $h \unlhd h_0$ like
Def. 6.5 does, because it is a consequence of the other conditions (see Lemma 6.16).

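Definition 6.6 (confined expression) Let $C = \Gamma\,\mathsf{self}$. Expression $\Gamma \vdash e : T$ is
confined iff for any $(h, \eta)$, if $\mathit{conf}\ C\ (h, \eta)$ and $[\![\Gamma \vdash e : T]\!](h, \eta) \neq \bot$ then the following
hold, where $d = [\![\Gamma \vdash e : T]\!](h, \eta)$.

(1) $C \nleq \mathit{Rep} \wedge C \nleq \mathit{Own} \Rightarrow d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$

(2) $C \leq \mathit{Own} \Rightarrow (d \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Rh_j))$
for some confining partition and $j$ with $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$

(3) $C \leq \mathit{Rep} \Rightarrow (d \in \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Oh_j * Rh_j))$
for some confining partition and $j$ with $\eta\,\mathsf{self} \in \mathit{dom}(Rh_j)$ □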

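Definition 6.7 (confined command) Let $C = \Gamma\,\mathsf{self}$. Command $\Gamma \vdash S$ is confined
iff

- $\mathit{conf}\,\mu \wedge \mathit{conf}\ C\ (h, \eta) \wedge [\![\Gamma \vdash S]\!]\mu(h, \eta) \neq \bot \Rightarrow \mathit{conf}\ C\ (h_0, \eta_0)$ for any $\mu$ and any
$(h, \eta)$, where $(h_0, \eta_0) = [\![\Gamma \vdash S]\!]\mu(h, \eta)$
- if $S$ is a method call then it has confined arguments (see below). □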

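Confinement of arguments means that the store $\eta_1$ passed in the semantics of
method call is confined for the callee.

Definition 6.8 (confined arguments) Let $C = \Gamma\,\mathsf{self}$. A call $\Gamma \vdash x := e.m(\bar{e})$
has confined arguments provided the following holds. Suppose $\bar{U}$ is the static type
of $\bar{e}$ and $D$ the static type of $e$. For any $(h, \eta)$ with $\mathit{conf}\ C\ (h, \eta)$, let

\[
\bar{d} = [\![\Gamma \vdash \bar{e} : \bar{U}]\!](h, \eta) \qquad \ell = [\![\Gamma \vdash e : D]\!](h, \eta) \qquad \eta_1 = [\bar{x} \mapsto \bar{d}, \mathsf{self} \mapsto \ell] .
\]

If $\ell \neq \bot$, $\ell \neq \mathit{nil}$, and $\bar{d} \neq \bot$ then $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h, \eta_1)$.

A super-call $\Gamma \vdash x := \mathsf{super}.m(\bar{e})$ has confined arguments provided the following
holds. Suppose $\bar{U}$ is the static type of $\bar{e}$. For any $(h, \eta)$ with $\mathit{conf}\ C\ (h, \eta)$, let

\[
\bar{d} = [\![\Gamma \vdash \bar{e} : \bar{U}]\!](h, \eta) \qquad \ell = \eta\,\mathsf{self} \qquad \eta_1 = [\bar{x} \mapsto \bar{d}, \mathsf{self} \mapsto \ell] .
\]

If $\bar{d} \neq \bot$ then $\mathit{conf}\ (\mathit{super}\,C)\ (h, \eta_1)$. □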

A purely semantic formulation would call class table $CT$ confined just if $[\![CT]\!]$
is a confined method environment. But under simple restrictions, confinement of
$[\![CT]\!]$ follows from confinement of method bodies and constructors. Thus we choose
the following.

Definition 6.9 (confined class table) Class table $CT$ is confined iff for every
$C$ and every $m$ with $\mathit{mtype}(m, C) = \bar{T} \to T$ the following hold.

(1) If $m$ is declared in $C$ by $T\ m(\bar{T}\ \bar{x})\{S\}$ then $S$ and all its constituents are
confined.

(2) If the constructor declaration in $C$ is $\mathsf{con}\{S\}$ then $S$ and all its constituents
are confined.

(3) If $C \leq \mathit{Own}$ then $T \nleq \mathit{Rep}$.

(4) If $m$ is inherited in $\mathit{Own}$ from some $B > \mathit{Own}$ then $\bar{T} \nleq \mathit{Rep}$.

(5) No $m$ is inherited in $\mathit{Rep}$ from any $B > \mathit{Rep}$. □

In Sect. 10 we add module-scoped methods on which condition (3) need not be 
imposed. This condition ensures that owner methods do not return reps, which is 
not ensured by confinement of the method body. Condition (5) is needed because 
confinement of a method inherited from B > Rep depends on the arguments, 
including self, being confined at B where reps are disallowed. Invocation of such 
a method on an object of type Rep (or a subclass) would yield a store with self a 
rep. A more refined restriction is to disallow inheritance into Rep only for methods 
which leak self; see Sect. 12. 
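
The purely signature-based conditions (3)-(5) of Def. 6.9 can be checked mechanically. The following is a minimal sketch in Python, not the paper's formalization: the toy class-table encoding and the names `CT`, `supers`, `is_sub`, `methods`, and `violations` are assumptions made for illustration, and conditions (1) and (2), which concern confinement of method bodies and constructors, are not checked at all.

```python
# Hypothetical toy class table: name -> (superclass, {method: (param_types, result_type)}).
CT = {
    "Object": (None, {}),
    "Node": ("Object", {"next": ((), "Node")}),                 # plays the role of Rep
    "Queue": ("Object", {"size": ((), "int"),
                         "enqueue": (("Object",), "unit")}),    # plays the role of Own
}

def supers(ct, c):
    """Yield c followed by its proper superclasses, nearest first."""
    while c is not None:
        yield c
        c = ct[c][0]

def is_sub(ct, c, b):
    """c <= b in the subclass order; non-class types are only below themselves."""
    if c not in ct:
        return c == b
    return b in supers(ct, c)

def methods(ct, c):
    """Map each method visible in c to (declaring class, param types, result type)."""
    found = {}
    for b in supers(ct, c):
        for m, (params, result) in ct[b][1].items():
            found.setdefault(m, (b, params, result))
    return found

def violations(ct, own, rep):
    """List (condition number, class, method) for each breach of Def. 6.9(3)-(5)."""
    out = []
    for c in ct:                                   # (3): owner methods do not return reps
        if is_sub(ct, c, own):
            for m, (_, _, result) in methods(ct, c).items():
                if is_sub(ct, result, rep):
                    out.append((3, c, m))
    for m, (b, params, _) in methods(ct, own).items():
        if b != own and any(is_sub(ct, t, rep) for t in params):
            out.append((4, own, m))                # (4): inherited methods take no reps
    for m, (b, _, _) in methods(ct, rep).items():
        if b != rep:
            out.append((5, rep, m))                # (5): Rep inherits no methods
    return out
```

In this encoding, adding a hypothetical method `leak` with result type `Node` to `Queue` is reported as a breach of condition (3), which mirrors the remark that owner methods must not return reps.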
Example 6.10 Condition (3) precludes the bad method of Sect. 2.1, for both return
types Object and Bool. Except for this, all examples in Sect. 2 yield confined class 
tables (e.g., a well formed class table is obtained by combining Figs. 2 and 3). 
One way to prove confinement for these examples is to check that they are safe 
according to the static analysis of Sect. 11. For this one uses the desugarings of 
Remark 4.1. □ 

6.3 Properties of confinement 

We need a number of results about confinement. The most important is that the se- 
mantics of a confined class table is a confined method environment (Theorem 6.17). 
This depends on Lemma 6.16 which says that confined commands extend heap par- 
titions, provided that method meanings have this property. 

Lemma 6.11 If $T$ is bool or unit, then every $\Gamma \vdash e : T$ is confined.

Proof. Direct from the definitions: confinement only pertains to locations. □ 

Lemma 6.12 Suppose $rng\,\eta \cap locs(Rep{\downarrow}) = \emptyset$ and $C \le B$. Then for any $h$ and
any $\eta \in [\![\Gamma, \mathit{self} : C]\!]$ we have $conf\ C\ (h, \eta)$ iff $conf\ B\ (h, \eta)$.

Proof. Straightforward. See Appendix. □

Lemma 6.13 If $conf\ C\ (h, \eta)$ and $h \trianglelefteq h_0$ then $conf\ C\ (h_0, \eta)$.

Proof. Straightforward. See Appendix. □

Although confining partitions are not unique, a given confining partition of an 
initial state can be extended to one on the final state for any command. This 
is Lemma 6.16 below, which depends on the analogous property for constructors,
Lemma 6.15. From the proof of the latter, we factor out the induction step as
a somewhat complicated separate result, Lemma 6.14, because it is also used in 
Sect. 11 to show soundness of the static analysis. Skip on first reading! 

Lemma 6.14 Let $\mu$ be a method environment. Suppose we have the following:

(1) $\mathit{self} : C \vdash S$ is a confined command.

(2) for any B with an occurrence of new B in 5 we have B \Z C and moreover no 
method calls occur in S. 

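(3) for any $B$ with an occurrence of new $B$ in $S$, and also for $B = super\,C$ unless
$super\,C = Object$, the following holds for any $(h, \eta)$ with $conf\ B\ (h, \eta)$:

$[\![\mathit{self} : B \vdash S_0 : con]\!]\mu(h, \eta) \ne \bot \Rightarrow h \trianglelefteq h_1$ ,

where $S_0 = constr\,B$ and $h_1 = [\![\mathit{self} : B \vdash S_0 : con]\!]\mu(h, \eta)$.

Then for any $(h, \eta)$ with $conf\ C\ (h, \eta)$, if $[\![\mathit{self} : C \vdash S : con]\!]\mu(h, \eta) \ne \bot$ then $h \trianglelefteq h_0$
where $h_0 = [\![\mathit{self} : C \vdash S : con]\!]\mu(h, \eta)$.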

Proof. Assume (1-3) hold. To show the conclusion for the non-$\bot$ case, consider
any $(h, \eta)$ with $conf\ C\ (h, \eta)$ and let $h_1$ be as in the semantics of $S$ as a
constructor. If $super\,C = Object$ then $h_1 = h$ and thus $h \trianglelefteq h_1$. Otherwise,
$h_1 = [\![\mathit{self} : super\,C \vdash constr(super\,C) : con]\!]\mu(h, \eta)$ and $h \trianglelefteq h_1$ holds by hypothesis
(3). Now by semantics, $h_0 = [\![\mathit{self} : C \vdash S]\!]\mu(h_1, \eta)$.
To show that $h \trianglelefteq h_0$, we can argue by induction on the structure of $S$. Note
that $S$ has no method calls, by hypothesis (2), so $\mu$ is not relevant. Moreover, for
any object construction the result holds by hypothesis (3). We omit the rest of
the argument, which uses hypothesis (1): it is exactly the same as in the proof
of Lemma 6.16 below, except for appealing to hypothesis (2) for the case of new,
where that proof appeals to Lemma 6.15. □

Lemma 6.15 (extension by constructors) Suppose $\mathit{self} : C \vdash constr\,C$ is confined,
for all $C$. Then for any $(h, \eta)$ with $conf\ C\ (h, \eta)$ we have

$[\![\mathit{self} : C \vdash S : con]\!]\mu(h, \eta) \ne \bot \Rightarrow h \trianglelefteq h_0$ where $h_0 = [\![\mathit{self} : C \vdash S : con]\!]\mu(h, \eta)$ .

Proof. This is exactly the conclusion of Lemma 6.14. We prove it by well 
founded induction on C, using ^ which is well founded by Lemma 4.9(1). For 
any C and S, it suffices to show that the hypotheses (1-3) of Lemma 6.14 hold 
for classes smaller than C with respect to -C. First, (1) holds by hypothesis of the 
present Lemma. By well formedness of the class table, there are no method calls
in constrC, and moreover if new B occurs in S then B C C; this is hypothesis (2). 
Now from Lemma 4.9 and well formedness of the class table we have that B -C C 
for every new B that occurs in S and also super C -C C. Thus by the induction 
hypothesis we have (3). □ 

Lemma 6.16 (extension by commands) Suppose $\Gamma \vdash S$ is confined and all its
constituents are confined. Suppose moreover that $\mathit{self} : B \vdash constr\,B$ is confined, for
all $B$. Let $C = \Gamma\,\mathit{self}$. For any $\mu, h, \eta$ with $conf\ \mu$ and $conf\ C\ (h, \eta)$

$[\![\Gamma \vdash S]\!]\mu(h, \eta) \ne \bot \Rightarrow h \trianglelefteq h_0$ where $(h_0, -) = [\![\Gamma \vdash S]\!]\mu(h, \eta)$ .

Proof. By structural induction on $S$. Let $C = \Gamma\,\mathit{self}$. We assume a confining
partition $h = Ch * Oh_1 * Rh_1 * \ldots * Oh_k * Rh_k$ is given ($k$ may be 0, i.e., there
need not be any islands). We show how to construct confining partition $h_0 =
Ch^0 * Oh^0_1 * Rh^0_1 * \ldots$ that extends the given one.

Case $\Gamma \vdash e_1.f := e_2$. From $[\![\Gamma \vdash e_1.f := e_2]\!]\mu(h, \eta) \ne \bot$ and Lemma 5.7(3) we have
that $\ell \in dom\,h$ where $\ell = [\![\Gamma \vdash e_1 : C]\!](h, \eta)$. By semantics, $h_0 = [h \mid \ell \mapsto [h\ell \mid f \mapsto d]]$.
We partition $h_0$ using the given partition for $h$. That is, the domain for each
block of the updated heap $h_0$ is the same as the corresponding block for $h$. Clearly
this extends the partition for $h$. To show that this partition is confining for $h_0$, it
suffices to show that the update of $h\,\ell\,f$ to $d$ satisfies the confinement property for
$\ell$. We argue by cases on $loctype\ \ell$.

— $loctype\ \ell \not\le Own \wedge loctype\ \ell \not\le Rep$. Then Def. 6.2(1) applies; it requires $d \notin
locs(Rep{\downarrow})$. By typing, $loctype\ \ell \le C$, so $C \not\le Own \wedge C \not\le Rep$. Thus by confinement
of $e_1$ (a constituent of $e_1.f := e_2$ and therefore confined by hypothesis), we
have by Def. 6.6(1) that $d \notin locs(Rep{\downarrow})$.

— $loctype\ \ell \le Own$. Def. 6.2(2) and (3) apply here. Letting $j$ be the index of the
island with $\{\ell\} = dom(Oh^0_j) = dom(Oh_j)$, we must show both $Oh^0_j \not\leadsto Rh^0_i$ (for
$i \ne j$) and $Oh^0_j \not\leadsto^{\bar{g}} Rh^0_j$. By typing, $loctype\ \ell \le C$, so $C \le Own$ or $Own < C$ by
the tree property of $\le$. We argue by cases on $C$.

— $Own < C$. By $Own \not\le Rep$, we have $C \not\le Rep$ so confinement of $e_2$ at $C$ yields
$d \notin locs(Rep{\downarrow})$. Thus $Oh^0_j \not\leadsto Rh^0_i$ and $Oh^0_j \not\leadsto^{\bar{g}} Rh^0_j$.
— $C \le Own$. By confinement of $e_2$, if $d \in locs(Rep{\downarrow})$ then $d \in dom(Rh_j)$
so $Oh^0_j \not\leadsto Rh^0_i$ for $i \ne j$. If $C = Own$ then, by the typing rule for field
update, $f$ is in the private fields $\bar{g}$ of $Own$, so the update cannot violate
$Oh^0_j \not\leadsto^{\bar{g}} Rh^0_j$. If $C < Own$ then $d \notin locs(Rep{\downarrow})$ because if $d$ is a rep then there
would be no confining partition, contradicting confinement of $h_0$ which holds
by confinement of $S$.

— $loctype\ \ell \le Rep$. Def. 6.2(4) applies in this case: we need to show $Rh^0_j \not\leadsto
Oh^0_i * Rh^0_i$ where $i \ne j$ and $j$ is the island for $\ell$ in the partition of $h$. By typing,
$loctype\ \ell \le C$, hence $C \le Rep$ or $Rep < C$. But if $Rep < C$ then $C \not\le Own$ and
the confinement condition for $e_1$ (Def. 6.6(1)) at $C$ contradicts $loctype\ \ell \le Rep$,
so we have $C \le Rep$. Now confinement of $e_2$ yields $d \in locs(Own{\downarrow}, Rep{\downarrow}) \Rightarrow d \in
dom(Oh_j * Rh_j)$. This proves $Rh^0_j \not\leadsto Oh^0_i * Rh^0_i$, because $dom(Oh_j * Rh_j) =
dom(Oh^0_j * Rh^0_j)$.

Case $\Gamma \vdash x := \mathit{new}\ B$. In the semantic definition, $h_1 = [h \mid \ell \mapsto [fields\,B \mapsto
defaults]]$ where $\ell = fresh(B, h)$. Define $Bh = [\ell \mapsto [fields\,B \mapsto defaults]]$ so
$h_1 = h * Bh$. Let $\eta_1 = [\mathit{self} \mapsto \ell]$. Next, we argue that $h \trianglelefteq h_1$ and $conf\ B\ (h_1, \eta_1)$.
Because $h$ is closed, $\ell$ is not in the range of any object state in $h$. To construct an
extending partition it suffices to deal with the new object, as its addition cannot
violate confinement of existing objects. (This would not be the case if dangling
pointers were allowed, unless further restrictions are imposed on $fresh$.) We define
the extension and argue by cases on $B$.

— $B \not\le Own \wedge B \not\le Rep$. For a confining partition of $h_1$ we extend that for $h$
by defining $Ch^1 = Ch * Bh$ and using the given partition of owner islands.
Because $defaults$ contains no locations, this is a confining partition and we have
$conf\ B\ (h_1, \eta_1)$.

— $B \le Own$. We extend the partition by adding an island $Oh^1_{k+1} * Rh^1_{k+1}$ with
$Oh^1_{k+1} = Bh$ and $Rh^1_{k+1} = \emptyset$. This is a confining partition because $defaults$ has
no locations and we have $conf\ B\ (h_1, \eta_1)$ because $rng\,\eta_1$ has no reps.

— $B \le Rep$. We can obtain a confining extension by adding $Bh$ to any of the
$Rh_i$, as $defaults$ has no locations. As $rng\,\eta_1 = \{\ell\}$, we have $conf\ B\ (h_1, \eta_1)$ by
definition.

This concludes the argument for $h \trianglelefteq h_1$ and $conf\ B\ (h_1, \eta_1)$. These let us apply
Lemma 6.15 for $constr\,B$ to get $h_1 \trianglelefteq h_0$ where $h_0 = [\![\mathit{self} : B \vdash constr\,B : con]\!]\mu(h_1, \eta_1)$.
Then $h \trianglelefteq h_0$ by transitivity of $\trianglelefteq$.

Case $\Gamma \vdash x := e.m(\bar{e})$. As $e.m(\bar{e})$ is confined, its argument values are confined.
Thus we can obtain the result directly from confinement of $\mu$, which explicitly
stipulates $h \trianglelefteq h_0$, and semantics of $e.m(\bar{e})$.

The remaining cases are straightforward. See Appendix. □ 



^In this case we have $C \le Own$ or $C \le Rep$, as otherwise the command would not be confined.
To show $conf\ C\ (h_1, \eta)$ in this case, we would have to put $\ell$ in $Rh_j$, choosing $j$ such that $\eta\,\mathit{self}$
is in the $j$th island. But we are only showing the extension of the partition for this lemma. For
soundness of the static analysis, we do have to show $conf\ C\ (h_1, \eta)$.
Theorem 6.17 Suppose that $CT$ is confined. Then the semantics $[\![CT]\!]$ is confined,
as is each $\mu_j$ in the approximation chain used to define it.

The proof uses fixpoint induction, which is only sound for inclusive predicates, 
i.e., those closed under limits of ascending chains. For confinement of method 
environments the definition is given pointwise, ultimately unfolding to the property 
that the semantics of each method body preserves confinement. This definition, as 
well as the one for the simulation $\mathcal{R}$ later, is in the usual form of logical relations.
By the structure of the definition, and continuity of the semantics, the property is 
an inclusive predicate. 

Proof. Confinement of $[\![CT]\!]$ follows by fixpoint induction from confinement
of $\mu_i$ for all $i$, which we show by induction on $i$. The base case holds because
$\mu_0\,C\,m = \lambda(h, \eta) \bullet \bot$, for any $C, m$, and this is confined by definition.
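
The shape of the approximation chain used in this induction can be illustrated with a small sketch. It is not the paper's semantics: the method body below is a hypothetical recursive counter over integers rather than a state transformer over heaps and stores, `BOTTOM` stands for the improper outcome written $\bot$, and the names `mu0`, `step`, and `approx` are assumptions for illustration only.

```python
BOTTOM = None  # stands for the improper outcome (bottom)

def mu0(n):
    """mu_0: the everywhere-undefined method meaning."""
    return BOTTOM

def step(mu):
    """One unfolding of a hypothetical recursive method body, using mu for calls."""
    def next_mu(n):
        if n == 0:
            return 0
        r = mu(n - 1)                       # the recursive call is interpreted by mu
        return BOTTOM if r is BOTTOM else r + 1
    return next_mu

def approx(i):
    """mu_i, obtained by unfolding the body i times starting from mu_0."""
    mu = mu0
    for _ in range(i):
        mu = step(mu)
    return mu
```

Here `approx(i)` is defined on exactly those arguments that need fewer than `i` unfoldings. A property established for `mu0` and preserved by `step` holds for every element of the chain, which is the pattern of the induction on $i$ in the proof.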

For the induction step, suppose $conf\ \mu_i$, to show $conf\ \mu_{i+1}$. Consider an arbitrary
$m$. We argue for all $C$ with $mtype(m, C)$ defined, by induction on method depth
(Def. 4.7) of $C$ for $m$. The base case is $C$ such that $depth(m, C) = 0$. In this case,
$CT(C)$ has a declaration

$T\ m(\bar{T}\ \bar{x})\{S\}$ .

Suppose $conf\ C\ (h, \eta)$ and $\mu_{i+1}\,C\,m\,(h, \eta) \ne \bot$. Let $(h_0, d) = \mu_{i+1}\,C\,m\,(h, \eta)$, which
by definition of $\mu_{i+1}$ is obtained as

$\eta_1 = [\eta \mid \mathit{result} \mapsto default]$
$(h_0, \eta_0) = [\![\bar{x} : \bar{T}, \mathit{self} : C, \mathit{result} : T \vdash S]\!]\mu_i(h, \eta_1)$
$d = \eta_0\,\mathit{result}$

Default values do not violate confinement so $conf\ C\ (h, \eta_1)$. As $CT$ is confined,
$S$ and its constituents are confined. By Lemma 6.16 we have $h \trianglelefteq h_0$, so by
Lemma 6.13 we have $conf\ C\ (h_0, \eta)$. To show the confinement condition for $\mu_{i+1}\,C\,m$
it remains to deal with the result value $d$. We have $conf\ C\ (h_0, \eta_0)$ by confinement
of $S$. We argue by cases on $C$.

— $C \not\le Own \wedge C \not\le Rep$. We need $d \notin locs(Rep{\downarrow})$, for Def. 6.5(1), and this follows
from $conf\ C\ (h_0, \eta_0)$ by Def. 6.4(1).
— $C \le Own$. We need $d \notin locs(Rep{\downarrow})$, and since by typing we have $d \in [\![T_\bot]\!]$,
Def. 6.9(3) ensures $T \not\le Rep$ and hence $d \notin locs(Rep{\downarrow})$. (Note that semantic
confinement of $\eta_0$ at $C \le Own$ allows reps, so it is not enough for this case).
— $C \le Rep$. Then we need $d \in locs(Own{\downarrow}, Rep{\downarrow})$ to imply that $d$ is in the domain
of $Oh_j * Rh_j$ for some partition and island $j$ such that $\eta\,\mathit{self} \in dom(Oh_j * Rh_j)$.
This follows from $conf\ C\ (h_0, \eta_0)$ by Def. 6.4(3).

This concludes the base case of the induction on depth. 

For the induction step, i.e., $depth(m, C) > 0$, $m$ may be inherited or declared in
$C$. If it is declared in $C$ the argument is the same as for the case $depth(m, C) = 0$
above. Suppose $m$ is inherited in $C$ from $B$. Now $\mu_{i+1}\,C\,m = restr((\mu_{i+1}\,B\,m), C)$
by definition of $\mu_{i+1}$. By induction on depth $\mu_{i+1}\,B\,m$ satisfies the confinement



i^See Ploto's notes. 
condition for $m, B$. To show the condition for $\mu_{i+1}\,C\,m$, suppose $conf\ C\ (h, \eta)$. We
claim that $conf\ B\ (h, \eta)$. Using the claim, we argue as follows. If $\mu_{i+1}\,B\,m\,(h, \eta) \ne \bot$,
let $(h_0, d) = \mu_{i+1}\,B\,m\,(h, \eta)$. By induction on depth we have $conf\ B\ (h_0, \eta)$ and
$h \trianglelefteq h_0$. By Lemma 6.13 we obtain $conf\ C\ (h_0, \eta)$. It remains to show the confinement
condition for $d$ and to prove the claim. We argue by cases on $C$.

In the following non-rep cases, the claim holds by Lemma 6.12. To apply the
Lemma, we just need to show that $rng\,\eta \cap locs(Rep{\downarrow}) = \emptyset$.

— $C \not\le Own \wedge C \not\le Rep$. In this case, we have $rng\,\eta \cap locs(Rep{\downarrow}) = \emptyset$ by confinement
of $\eta$ at $C$, Def. 6.4(1).
— $C \le Own < B$. Then $Own$ inherits $m$ from $B > Own$, so by confinement of the
class table, Def. 6.9(4), we have $\bar{T} \not\le Rep$. Also, $Own \not\le Rep$, so by Lemma 5.4
we have no reps in $rng\,\eta$.

In the preceding cases, the condition imposed on $d$ by Def. 6.5(1) for class $C$ is
$d \notin locs(Rep{\downarrow})$. But this same condition is imposed for class $B$, and it holds by
induction on depth. For the remaining cases we prove the claim $conf\ B\ (h, \eta)$ as
follows.

— $C \le B \le Own$. Both $B$ and $C$ impose the same condition (Def. 6.4(2)).

— $C \le B \le Rep$. Both $C$ and $B$ impose the same conditions on $\eta$ (Def. 6.4(3)).

In these two cases the requirement for d at C, Def. 6.5(2) or (1), is the same as for 
B, so it holds by induction on depth. 

The case C < Rep < B cannot occur in a confined class table. If m is inherited in 
C < Rep from B then it is inherited in Rep from B, and this is explicitly disallowed 
in Def. 6.9(5). □ 

7. FIRST ABSTRACTION THEOREM 

This section formulates and proves the central result of the paper. First, we make 
precise the idea of comparing two class tables that differ only in their implemen- 
tation of class Own. Then we define basic coupling: a relation between single 
instances of class Own for the two implementations. This induces the coupling 
relations for other data types, for heaps containing multiple instances of Own, and 
for method meanings. Related method meanings have the simulation property: 
if initial states are coupled, then so are outcomes. The main theorem says that if 
methods of Own have the simulation property, then so do all methods of all classes. 

7.1 Comparing class tables 

We compare two implementations of a designated class Own. They can have 
completely different declarations, so long as methods of the same signatures are 
present (declared or inherited) in both. They can use different reps, distinguished
by class name $Rep$ for one implementation and $Rep'$ for the other. We
allow Rep = Rep'. For simplicity, we assume that both Rep and Rep' are in each 
of the two compared class tables. 

An alternative formulation would consider different declarations of Own together with associated 
class tables in which Rep or Rep' but not both are declared. But these could be combined into 
class tables fitting our formulation. 
Definition 7.1 (comparable class tables, non-rep classes) Suppose class names
$Own$, $Rep$, $Rep'$ are given, such that $Own \not\le Rep$ and $Own \not\le Rep'$. We say $C$ is a
non-rep class iff $C \not\le Rep$ and $C \not\le Rep'$. Well formed class tables $CT$ and $CT'$ are
comparable provided the following hold.

(1) $CT$ and $CT'$ are identical except for their values on $Own$. (In particular,
$CT(Rep) = CT'(Rep)$ and $CT(Rep') = CT'(Rep')$.)

We write $\vdash$, $\vdash'$ for the typing relations determined by $CT$, $CT'$ respectively,
and similarly for the auxiliary functions, such as $mtype$, $mtype'$. We also write
$[\![-]\!]$, $[\![-]\!]'$ for the respective semantics and assume that the same allocator, $fresh$,
is used for both $[\![-]\!]$ and $[\![-]\!]'$.

(2) $super\,Own = super'\,Own$.

(3) For any $m$, either $mtype(m, Own)$ and $mtype'(m, Own)$ are both undefined or
both are defined and equal. □

Example 7.2 Let CT be given by Figs. 2 and 3. Let CT' be given by Figs. 4 and 
3 together with Observer from Fig. 2. These are comparable. □ 

Instead of condition (3), one could require that CT{Own) and CT'{Own) de- 
clare the same methods. But that would disallow some situations that occur in 
practice. Suppose class C extends B by adding a method m implemented using 
calls to methods inherited from B. This might be the easiest way to achieve de- 
sired functionality for m, but there could be an alternative data structure that is 
more efficient for m and for the methods of B. An alternative implementation of 
C could add that data structure and override the methods of B to use it. One can 
argue that the program is poorly designed, e.g., because space for attributes of B is 
wasted in C objects. Better designs are possible in languages with interface types 
separate from classes. Nonetheless, such examples do arise in practice. Allowing 
them complicates the proof of Theorem 7.20 but none of the other results. The 
main consequence we need from condition (3) is the following. 

Lemma 7.3 If mtype{m, C) is defined then depth{m, C) = depth' {m, C). 

Proof. Straightforward. See Appendix. □ 
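
The situation that motivates condition (3) of Def. 7.1, a method implemented through inherited methods in one version and through overriding in the other, can be sketched as follows. The classes `B`, `C1`, `C2` and the methods `add`, `total`, `mean` are hypothetical names chosen for illustration; Python stands in for the paper's Java-like language, and `C1` and `C2` play the role of the two compared declarations of one class.

```python
class B:
    def __init__(self):
        self.items = []

    def add(self, x):
        self.items.append(x)

    def total(self):
        return sum(self.items)


class C1(B):
    """First version: mean is written using the methods inherited from B."""

    def mean(self):
        return self.total() / len(self.items)


class C2(B):
    """Second version: overrides add and total to maintain running sums."""

    def __init__(self):
        super().__init__()          # the list inherited from B is allocated but unused
        self._sum, self._n = 0, 0

    def add(self, x):
        self._sum += x
        self._n += 1

    def total(self):
        return self._sum

    def mean(self):
        return self._sum / self._n
```

Both versions offer the same methods with the same signatures, but `add` and `total` are inherited in `C1` and declared in `C2`. Requiring identical sets of declared methods would rule this pair out, whereas a condition on `mtype` alone admits it. The unused `items` list in `C2` is the wasted space mentioned in the text.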

One can imagine a theory in which an owner subclass C < Own has different 
declarations in CT and CT'. But we are concerned with an abstraction provided 
by a single class rather than by a collection of classes, so $CT(C) = CT'(C)$ here.
In Sect. 7.3 we impose a restriction on owner subclasses that is needed for the first 
abstraction theorem. The issue is explored in Sect. 8 and the restriction lifted in 
Sect. 10. 

7.2 Coupling relations and simulation 

The definitions are organized as follows. A basic coupling $R$ is a suitable relation
on islands. This induces a family of coupling relations, $\mathcal{R}\,\theta$ for each $\theta$. Then comes
the definition of simulation, a coupling that is preserved by all methods of Own 
and established by the constructor. 

Definition 7.4 (basic coupling) Given comparable class tables, a basic coupling
is a binary relation $R$ on heaps (not necessarily closed) such that the following
holds: For any $h, h'$, if $R\,h\,h'$ then there is a location $\ell$ with $loctype\ \ell \le Own$ and
partitions $h = Oh * Rh$ and $h' = Oh' * Rh'$ such that

(1) $dom\,Oh = \{\ell\} = dom\,Oh'$

(2) $dom(Rh) \subseteq locs(Rep{\downarrow})$ and $dom(Rh') \subseteq locs(Rep'{\downarrow})$

(3) $h\,\ell\,f = h'\,\ell\,f$ for all $(f : T) \in dom(fields(loctype\ \ell))$ with $f \notin \bar{g}$ and $f \notin \bar{g}'$, where
$\bar{g} = dom(dfields(Own))$ and $\bar{g}' = dom(dfields'(Own))$ □

Example 7.7 below shows why we allow R to act on heaps that are not closed. 

Although $R$ is unconstrained for the private fields and reps, condition (3) determines
it for fields of subclasses of $Own$. Once we have defined the induced relation
$\mathcal{R}$, item (3) will be equivalent to the condition $\mathcal{R}\,(type(f, loctype\ \ell))\,(h\,\ell\,f)\,(h'\,\ell\,f)$.

Because $CT$ and $CT'$ are well formed, the declared field names $\bar{g}$ and $\bar{g}'$ do not
occur as fields of subclasses or superclasses of $Own$. In (3), $f$ ranges over fields of
both subclasses and superclasses; except for $\bar{g}$ and $\bar{g}'$, we have $fields\,C = fields'\,C$
for all $C$. The typing relations $\vdash$ and $\vdash'$ are also the same except for class $Own$.

Example 7.5 Sect. 2.2 discusses this coupling relation:

$(o.g = nil = o'.g) \vee (o.g \ne nil \ne o'.g \wedge o.g.f = \neg(o'.g.f))$ .

For this example we take both $Rep$ and $Rep'$ to be Bool, and $Own$ to be OBool.
The displayed formula can be interpreted as relation $R$ which relates $h$ to $h'$ just if
either $h = [\ell_1 \mapsto [g \mapsto nil]]$ and $h' = [\ell_1 \mapsto [g \mapsto nil]]$ or else

$h = [\ell_1 \mapsto [g \mapsto \ell_2], \ell_2 \mapsto [f \mapsto d]]$ and $h' = [\ell_1 \mapsto [g \mapsto \ell_3], \ell_3 \mapsto [f \mapsto \neg d]]$

for some boolean $d$ and locations $\ell_1$ in $locs$ OBool and $\ell_2, \ell_3$ in $locs$ Bool. We assume
that the class table contains only Bool, OBool, and some client classes. If OBool
had subclasses, the relation on their fields would be determined by condition (3)
above. □
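
The relation of Example 7.5 is small enough to state as an executable predicate. The sketch below is an illustration under assumptions, not part of the paper: heaps are encoded as Python dictionaries from location names to field maps, `None` plays the role of nil, and the function name `coupled_obool` and its `owner` parameter are hypothetical.

```python
def coupled_obool(h, hp, owner):
    """Check the basic coupling of Example 7.5 on two single-island heaps.

    h and hp map location names to dicts of fields; owner is the location
    of the OBool object, which must be the same in both heaps.
    """
    if owner not in h or owner not in hp:
        return False
    g, gp = h[owner]["g"], hp[owner]["g"]
    if g is None or gp is None:
        # First disjunct: both g fields are nil and the islands have no reps.
        return g is None and gp is None and set(h) == set(hp) == {owner}
    # Second disjunct: each island has exactly one rep, with negated contents.
    return (set(h) == {owner, g} and set(hp) == {owner, gp}
            and h[g]["f"] == (not hp[gp]["f"]))
```

For instance, a heap whose rep stores `True` is coupled with one whose rep stores `False`, and the two rep locations need not coincide.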

Example 7.6 Sect. 3.1 uses the formula $o.g = o'.g \wedge o.g \bmod 2 = 0$. This can be
interpreted as the basic coupling $R$ that relates $h$ to $h'$ just if there is some $\ell$ with
$loctype\ \ell \le Own$, $h$ and $h'$ have domain $\{\ell\}$, and $h\,\ell\,g = h'\,\ell\,g = 2 \times m$ for some integer
$m > 0$. □

Example 7.7 The Observer examples show why we allow R to relate non-closed 
heaps. Consider the version in Fig. 2. Here Rep is Node, Own is Observable, and 
there is a client class Observer. Fig. 5 illustrates two instances of this simple data 
structure. Fig. 4 gives code for an alternative version which uses an extra node as 
sentinel for the list. The sentinel does not point to an Observer. Fig. 7 depicts a 
corresponding pair of heaps for the two alternatives, using arrows without desti- 
nation objects to indicate dangling pointers. Upon initialization of an Observable, 
there are no installed Observers, so for the version of Fig. 2 we should have fst = nil.
But in the alternative version, this should correspond to snt holding the location of 
a Node with ob = nil. This is established by the constructor in Fig. 4. An attempt 
at formalizing the correspondence is as follows: 

$(o.fst = nil = o'.snt.nxt) \vee (o.fst \ne nil \ne o'.snt.nxt \wedge \alpha(o.fst) = \alpha'(o'.snt.nxt))$

where $\alpha, \alpha'$ are functions that yield the list of locations in the ob fields of successive
nodes. But how should this formula be interpreted if, say, o'.snt = nil or there is 
Fig. 7. Basic coupling example. Labels indicate locations as described in
Example 7.7. Note dangling pointers $\ell_2$ and $\ell_4$ and sentinel node $\ell'_0$.
sharing such as a chain with cyclic tail? Separation logic [Reynolds 2001] offers a
precise way to formulate such definitions but its development is at an early stage.
We simply sketch the coupling in terms of semantics: $R\,h\,h'$ iff either $h$ and $h'$ have
the form

$h = [\ell \mapsto [fst \mapsto nil]]$
$h' = [\ell \mapsto [snt \mapsto \ell'_0], \ell'_0 \mapsto [ob \mapsto nil, nxt \mapsto nil]]$

or they have the form

$h = [\ell \mapsto [fst \mapsto \ell_1], \ell_1 \mapsto [ob \mapsto \ell_2, nxt \mapsto \ell_3], \ell_3 \mapsto [ob \mapsto \ell_4, nxt \mapsto \ldots], \ldots]$
$h' = [\ell \mapsto [snt \mapsto \ell'_0], \ell'_0 \mapsto [ob \mapsto nil, nxt \mapsto \ell'_1],$
$\qquad \ell'_1 \mapsto [ob \mapsto \ell_2, nxt \mapsto \ell'_3], \ell'_3 \mapsto [ob \mapsto \ell_4, nxt \mapsto \ldots], \ldots]$

for some locations $\ell$ in $locs$(Observable), $\ell_1, \ell_3, \ldots$ in $locs$(Node), $\ell'_0, \ell'_1, \ell'_3, \ldots$ in
$locs$(Node2), and $\ell_2, \ell_4, \ldots$ in $locs$(Observer$\downarrow$).

Note that the owners are at the same location, $\ell$, as are the referenced client
objects at $\ell_2, \ell_4, \ldots$. No correspondence is required between locations $\ell_1, \ell_3, \ldots$ and
$\ell'_0, \ell'_1, \ell'_3, \ldots$ of reps. □
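
The semantic sketch of Example 7.7 can also be rendered as an executable check. The following is only an illustration under assumptions: heaps are Python dictionaries from location names to field maps, `None` plays the role of nil, the names `obs` and `coupled_observable` are hypothetical, and the check does not verify that the heaps contain nothing beyond the owner and its chain. Dangling `ob` pointers are unproblematic because they are compared, never dereferenced.

```python
def obs(heap, node):
    """List the ob fields along the nxt chain from node, or None if ill-formed."""
    seen, out = set(), []
    while node is not None:
        if node in seen or node not in heap:
            return None                  # cyclic or broken chain denotes no list
        seen.add(node)
        out.append(heap[node]["ob"])     # may be a dangling location; never followed
        node = heap[node]["nxt"]
    return out

def coupled_observable(h, hp, owner):
    """Relate the plain list version (fst) to the sentinel version (snt)."""
    snt = hp[owner]["snt"]
    if snt is None or snt not in hp or hp[snt]["ob"] is not None:
        return False                     # the sentinel must exist and hold no Observer
    left = obs(h, h[owner]["fst"])
    right = obs(hp, hp[snt]["nxt"])
    return left is not None and left == right
```

Rejecting cyclic chains is one possible answer to the question raised in the example about sharing; it is a design choice of this sketch rather than something the text prescribes.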

A basic coupling induces a relation $\mathcal{R}\,Heap$ on arbitrary heaps by requiring that
they have confining partitions such that islands can be put in correspondence so that
pairs are related by $R$. The formal definition uses the induced relation $\mathcal{R}\,(state\ C)$
for object states of non-rep classes $C \ne Own$, and this in turn is defined in terms
of $\mathcal{R}\,C$ for non-rep classes $C \ne Own$. For uniformity, we give the definition of
$\mathcal{R}\,\theta$ for all $\theta$, but forcing the case for $\theta = state\ Own$ to be false, as the compared
states have different fields. Aside from the ramifications of heap confinement, the
definition is induced in the standard way for logical relations.

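Definition 7.8 (coupling relation, $\mathcal{R}\,\theta$) In the context of a basic coupling with
given relation $R$, we define for each $\theta$ a relation $\mathcal{R}\,\theta \subseteq [\![\theta]\!] \times [\![\theta]\!]'$ as follows.

For heaps $h, h'$, we define $\mathcal{R}\,Heap\ h\ h'$ iff there are confining partitions of $h, h'$,
with the same number $n$ of owner islands, such that

— $R\,(Oh_i * Rh_i)\,(Oh'_i * Rh'_i)$ for all $i$ in $1..n$
— $dom(Ch) = dom(Ch')$
— $\mathcal{R}\,(state\,(loctype\ \ell))\,(h\ell)\,(h'\ell)$ for all $\ell \in dom(Ch)$

For other categories we define $\mathcal{R}\,\theta$ as follows.

$\mathcal{R}\ bool\ d\ d' \iff d = d'$
$\mathcal{R}\ unit\ d\ d' \iff d = d'$
$\mathcal{R}\ C\ d\ d' \iff d = d'$
$\mathcal{R}\ \Gamma\ \eta\ \eta' \iff \forall x \in dom\,\Gamma \bullet \mathcal{R}\,(\Gamma x)\,(\eta x)\,(\eta' x)$
$\mathcal{R}\,(state\ C)\ s\ s' \iff C \ne Own \wedge \forall f \in dom(fields\,C) \bullet \mathcal{R}\,(type(f, C))\,(s f)\,(s' f)$
$\mathcal{R}\,(\theta_\bot)\ \alpha\ \alpha' \iff (\alpha = \bot = \alpha') \vee (\alpha \ne \bot \ne \alpha' \wedge \mathcal{R}\,\theta\ \alpha\ \alpha')$
$\mathcal{R}\,(Heap \otimes \Gamma)\,(h, \eta)\,(h', \eta') \iff \mathcal{R}\,Heap\ h\ h' \wedge \mathcal{R}\,\Gamma\ \eta\ \eta'$
$\mathcal{R}\,(Heap \otimes T)\,(h, d)\,(h', d') \iff \mathcal{R}\,Heap\ h\ h' \wedge \mathcal{R}\,T\ d\ d'$
$\mathcal{R}\,(C, \bar{x}, \bar{T} \to T)\ d\ d' \iff \forall (h, \eta) \in [\![Heap \otimes \Gamma]\!], (h', \eta') \in [\![Heap \otimes \Gamma]\!]' \bullet$
$\qquad \mathcal{R}\,(Heap \otimes \Gamma)\,(h, \eta)\,(h', \eta') \wedge conf\ C\ (h, \eta) \wedge conf\ C\ (h', \eta')$
$\qquad \Rightarrow \mathcal{R}\,(Heap \otimes T)_\bot\,(d(h, \eta))\,(d'(h', \eta'))$
$\qquad$ where $\Gamma = [\bar{x} \mapsto \bar{T}, \mathit{self} \mapsto C]$
$\mathcal{R}\ MEnv\ \mu\ \mu' \iff \forall C, m \bullet (C$ is non-rep$) \wedge (mtype(m, C)$ is defined$)$
$\qquad \Rightarrow \mathcal{R}\,(C, pars(m, C), mtype(m, C))\,(\mu\,C\,m)\,(\mu'\,C\,m)$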

The gist of the abstraction theorem is that if methods of $Own$ are related by $\mathcal{R}$
then all methods are. We can now express this conclusion as $\mathcal{R}\ MEnv\ [\![CT]\!]\ [\![CT']\!]'$.
To express the antecedent, note that the relation applicable to a method $m$ of $Own$
is $\mathcal{R}\,(Own, \bar{x}, \bar{T} \to T)$ where $mtype(m, Own) = \bar{T} \to T$ and $pars(m, Own) = \bar{x}$. The
definition of $\mathcal{R}\,(C, \bar{x}, \bar{T} \to T)$ quantifies over confined initial states but does not
require confinement of outcomes. The antecedent will also take into account that
methods may be declared or inherited.

Although the definition is technically intricate, the core idea is the extension of a 
basic coupling, for a single owner instance, to a heap containing potentially many 
owners. This idea is given straightforward expression using heap partitions. By 
contrast, sharing of representations between owners would require a more compli- 
cated form of extension (see Sect. 12). 

We aim to define per-instance simulations, and in particular the establishment of
such a relation by a constructor of class Own on a single island. But to formulate 
this semantically we describe the constructor's action on a heap in which other 
islands may be present. The reason is that there is not an easy way to connect a 
constructor's action on a small heap with its action on a larger one (see Footnote 14). 

Definition 7.9 (simulation) A simulation is a coupling TZ such that the following 
hold. 

(1) (constructors of $Own$ establish $\mathcal{R}$) For any $\ell \in locs(Own{\downarrow})$, any $h, h'$ with



^One might think that $\mathcal{R}\,Heap$ could be defined in terms of admissible partitions without the
assumption of confinement. But because partitions are not unique this leads to difficulties; a heap
could be confined with respect to one partition but related with respect to another.
$\mathcal{R}\,Heap\ h\ h'$, and any $\mu, \mu'$, let

$h_1 = [h \mid \ell \mapsto [fields(loctype\ \ell) \mapsto defaults]]$

$h'_1 = [h' \mid \ell \mapsto [fields'(loctype\ \ell) \mapsto defaults']]$

$h_0 = [\![\mathit{self} : (loctype\ \ell) \vdash constr(loctype\ \ell) : con]\!]\mu(h_1, [\mathit{self} \mapsto \ell])$

$h'_0 = [\![\mathit{self} : (loctype\ \ell) \vdash' constr'(loctype\ \ell) : con]\!]'\mu'(h'_1, [\mathit{self} \mapsto \ell])$

Then $R\,h_0\,h'_0$.

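(2) (methods of $Own$ preserve $\mathcal{R}$) Let $\mu \in \mathbb{N} \to [\![MEnv]\!]$ (resp. $\mu' \in \mathbb{N} \to [\![MEnv]\!]'$)
be the approximation chain in the definition of $[\![CT]\!]$ (resp. $[\![CT']\!]'$). For every
$m$ with $mtype(m, Own)$ defined, the following implications hold for every $i$,
where $\bar{x} = pars(m, Own)$ and $\bar{T} \to T = mtype(m, Own)$.

(a) $\mathcal{R}\ MEnv\ \mu_i\ \mu'_i \Rightarrow \mathcal{R}\,(Own, \bar{x}, \bar{T} \to T)\,([\![M]\!]\mu_i)\,([\![M']\!]'\mu'_i)$
if $m$ has declaration $M$ in $CT(Own)$ and $M'$ in $CT'(Own)$

(b) $\mathcal{R}\ MEnv\ \mu_i\ \mu'_i \Rightarrow \mathcal{R}\,(Own, \bar{x}, \bar{T} \to T)\,([\![M]\!]\mu_i)\,(restr([\![M_B]\!]'\mu'_i, Own))$
if $m$ has declaration $M$ in $CT(Own)$ and is inherited from $B$ in $CT'(Own)$,
with $M_B$ the declaration of $m$ in $B$

(c) $\mathcal{R}\ MEnv\ \mu_i\ \mu'_i \Rightarrow \mathcal{R}\,(Own, \bar{x}, \bar{T} \to T)\,(restr([\![M_B]\!]\mu_i, Own))\,([\![M']\!]'\mu'_i)$
if $m$ has declaration $M'$ in $CT'(Own)$ and is inherited from $B$ in $CT(Own)$,
with $M_B$ the declaration of $m$ in $B$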

In the case where constructors in Own and its subclasses are skip, condition (1) 
simply says that the default values are related. Note that it also precludes abort- 
ing constructors, as R applies to heaps but not to _L; this is convenient but not 
necessary. 

The following properties are straightforward consequences of the definition. 

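Lemma 7.10 For all $h, h'$ and all locations $\ell \notin locs(Rep{\downarrow}, Rep'{\downarrow})$, if $\mathcal{R}\,Heap\ h\ h'$
then $\ell \in dom\,h \iff \ell \in dom\,h'$. □

Lemma 7.11 $[\![T]\!] = [\![T]\!]'$ for all $T$, and $[\![\Gamma]\!] = [\![\Gamma]\!]'$ for all $\Gamma$. □

Lemma 7.12 For any data type $T$, $\mathcal{R}\,T$ is the identity relation on $[\![T]\!]$ and $\mathcal{R}\,T_\bot$
is the identity relation on $[\![T_\bot]\!]$. □

Lemma 7.13 If $U \le T$ and $\mathcal{R}\,U\ d\ d'$ then $\mathcal{R}\,T\ d\ d'$. □

7.3 Restricting reps in owner subclasses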

The preceding properties express a strong connection between locations for related 
heaps. To ensure that this connection is preserved by object construction, we shall
assume the allocator is parametric. But it is not reasonable to require that related
heaps have the same rep locations, so parametricity cannot be exploited for reps. 
As a result, the present form of simulation is not adequate for construction of reps in 
subclasses of Own, although such construction is allowed by confinement. The first 
abstraction theorem depends on an assumption expressed in the following terms. 

Definition 7.14 (new rep in sub-owner) We say CT has a new rep in a sub- 
owner if, for some B < Rep or B < Rep', an object construction new B occurs in 
some method declaration in a class C < Own. 
If $CT$ has no new reps in sub-owners then neither does a comparable $CT'$. The
examples in Sects. 2 and 3 have no new reps in sub-owners; examples which do are 
given in Sect. 8. 

In the rest of Sect. 7 we make the following assumption. It is used in the proof 
of Lemma 7.23 on which the first abstraction theorem depends. For the second 
abstraction theorem the second sentence will be dropped. 

Assumption 7.15 First, $CT$ and $CT'$ are confined class tables for which a simulation
$\mathcal{R}$ is given. Second, $CT$ has no new reps in sub-owners and the allocator is
parametric in the sense of Def. 5.1.

7.4 Identity extension 

A typical formulation of identity extension is that $\mathcal{R}\,T$ is the identity on any type
$T$ for which it is the identity on all primitive types $b$ that occur in $T$. The reason is
that no value of type $b$ can occur in a value of type $T$ if $b$ does not occur in $T$. But
this fails with extensible records and structural subtyping, and with procedures that
may have global variables [Naumann 2002]. It can be made to work using name- 
based (declaration) subclassing [Cavalcanti and Naumann 2002]: in the context of 
a complete class table, one can consider the classes that have no attributes with 
subclasses in which b occurs. For our purposes here it is enough to deal with the 
heap. 

In our language, $\mathcal{R}\,T$ is the identity for every data type $T$ (Lemma 7.12), but
that is only because the interesting data is in the heap, which is not typed at
all.^ In general, $[\![state\ Own]\!] \ne [\![state\ Own]\!]'$ and $\mathcal{R}\,(state\ Own)$ is not the identity.
Related heaps can contain owner objects with different states that may point to 
completely different rep objects. But consider executing a method on an object o 
from whose fields no Own objects are reachable, i.e., Own objects are not part of 
the representation of o. The resulting heap may contain Own objects that were 
assigned to local variables, but if the method is confined then those objects are 
unreachable in the final state. 

Definition 7.16 (garbage collection, Own-free) For a set or list $\bar{d}$ of values,
define the heap $gc(\bar{d}, h)$ to be the restriction of $h$ to cells reachable from $\bar{d}$. For
$(h, \eta) \in [\![Heap \otimes \Gamma]\!]$, define $collect(h, \eta) = (gc(rng\,\eta, h), \eta)$. Extend $collect$ to
$[\![(Heap \otimes \Gamma)_\bot]\!]$ by $collect\ \bot = \bot$.

Say $h$ is Own-free just if $dom\,h \cap locs(Own{\downarrow}) = \emptyset$ and $\eta$ is Own-free just if
$rng\,\eta \cap locs(Own{\downarrow}) = \emptyset$. □
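
The operations of Def. 7.16 amount to a reachability computation. The following sketch is an illustration under assumptions rather than the paper's definition: heaps and stores are Python dictionaries, locations are strings, any field or variable value that is not a key of the heap is treated as a non-location, and the names `gc`, `collect`, and `own_free` with the `is_owner` predicate are hypothetical.

```python
def gc(values, heap):
    """Restrict heap to the cells reachable from the given values."""
    reached = set()
    work = [v for v in values if v in heap]
    while work:
        loc = work.pop()
        if loc in reached:
            continue
        reached.add(loc)
        # Follow every field that holds a location of this heap.
        work.extend(v for v in heap[loc].values() if v in heap)
    return {loc: heap[loc] for loc in heap if loc in reached}

def collect(heap, store):
    """Garbage-collect the heap with the range of the store as roots."""
    return gc(store.values(), heap), store

def own_free(heap, is_owner):
    """A heap is Own-free if none of its locations belongs to an owner class."""
    return not any(is_owner(loc) for loc in heap)
```

In a heap where an owner and its rep exist but are not reachable from the store, `collect` discards both, so the collected heap is Own-free even though the original is not. This is the situation to which identity extension applies.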

Lemma 7.17 (identity extension) Suppose $\mathcal{R}\,(Heap \otimes \Gamma)\,(h, \eta)\,(h', \eta')$ and
$\Gamma\,\mathit{self}$ is non-rep. Let $(h, \eta)$ and $(h', \eta')$ be confined at $\Gamma\,\mathit{self}$. If $collect(h, \eta)$ and
$collect(h', \eta')$ are Own-free then $collect(h, \eta) = collect(h', \eta')$.

Proof. In confined heaps, reps are only reachable from owners. Now the argument
is a straightforward induction using the definition of $\mathcal{R}$. □



^Nor would we want to impose a typing system on the heap, as it would likely preclude unbounded
data structures [Grossman et al. 2000].
Lemma 7.18 For any $\mathcal{R}$ given by Def. 7.8 from a basic coupling, if $h \in [\![Heap]\!]$
is Own-free then $\mathcal{R}\,Heap\ h\ h$. If, in addition, $(h, \eta) \in [\![Heap \otimes \Gamma]\!]$ and $rng\,\eta$ is
Own-free then $\mathcal{R}\,(Heap \otimes \Gamma)\,(h, \eta)\,(h, \eta)$.

Proof. If $h$ is Own-free and confined then it has no reps; its admissible partition
is a single block, the clients. For such a heap it is immediate from the definition
of $\mathcal{R}$ that $\mathcal{R}\,Heap\ h\ h$. If $rng\,\eta$ is Own-free then $\mathcal{R}\,\Gamma\ \eta\ \eta$ is also direct from the
definition. □

7.5 Abstraction theorem 

The main theorem says that if methods of Own preserve the coupling relation then
so do all methods (see footnote). The proof depends on lemmas for constructors and commands.
These are given following the theorem. The other main ingredient for the proof is
the following connection between $\mathcal{R}$ and the semantics of inherited methods.

Lemma 7.19 Suppose $C$ and all class names in $\bar{T}$ are non-rep, and $B \le C$. If
$\mathcal{R}\,(C,\bar{x},\bar{T}{\to}T)\,d\,d'$ then $\mathcal{R}\,(B,\bar{x},\bar{T}{\to}T)\,(restr(d,B))\,(restr(d',B))$ where $restr$ is
the restriction to global states of $B$ (see Def. 5.5).

Proof. Straightforward, using Lemma 5.7(1). See Appendix. □

Theorem 7.20 (abstraction) $\mathcal{R}\;MEnv\;[\![CT]\!]\;[\![CT']\!]'$.

Proof. We show that the relation holds for each step in the approximation chain
in the semantics of class tables. That is, we show by induction on $i$ that

$$\mathcal{R}\;MEnv\;\mu_i\;\mu'_i \quad \text{for every } i \in \mathbb{N}.$$

The result $\mathcal{R}\;MEnv\;[\![CT]\!]\;[\![CT']\!]'$ then follows by fixpoint induction, as $[\![CT]\!]$ and
$[\![CT']\!]'$ are defined to be the fixpoints of these ascending chains. Admissibility of
fixpoint induction is discussed preceding the proof of Theorem 6.17.

For the base case, we have $\mathcal{R}\,(C, pars(m,C), mtype(m,C))\,(\mu_0\,C\,m)\,(\mu'_0\,C\,m)$ for
every $m, C$ because $\lambda(h,\eta)\,{\bullet}\,\bot$ relates to itself.

For the induction step, suppose

$$\mathcal{R}\;MEnv\;\mu_i\;\mu'_i. \qquad (*)$$

We must show $\mathcal{R}\;MEnv\;\mu_{i+1}\;\mu'_{i+1}$, that is, for every non-rep $C$ and every $m$ with
$mtype(m,C)$ defined:

$$\mathcal{R}\,(C,\bar{x},\bar{T}{\to}T)\,(\mu_{i+1}\,C\,m)\,(\mu'_{i+1}\,C\,m) \qquad (\dagger)$$

where $\bar{x} = pars(m,C)$ and $\bar{T}{\to}T = mtype(m,C)$. For arbitrary $m$ we show (†) for
all $C$ with $mtype(m,C)$ defined, using a secondary induction on $depth(m,C)$. We
have $depth'(m,C) = depth(m,C)$ (Lemma 7.3).

Footnote: Readers familiar with Reynolds [1984] may expect that, as our language has fixpoints, the result
only holds for couplings that are $\bot$-strict and join-complete. But our basic couplings have this
property, trivially, because heaps are ordered by equality. The induced coupling is strict and
join-complete by construction.

The base case is the unique $C$ with $depth(m,C) = 0$; here $m$ is declared in both
$CT(C)$ and $CT'(C)$. We go by cases on $C$. If $C = Own$, we get (†) from the
assumption that $\mathcal{R}$ is a simulation. In detail: Using (*) and Def. 7.9(2a) we get

$$\mathcal{R}\,(Own,\bar{x},\bar{T}{\to}T)\,([\![M]\!]\mu_i)\,([\![M']\!]'\mu'_i),$$

whence (†) by definition of $\mu_{i+1}$ and $\mu'_{i+1}$. The other case is $C$ a non-rep class
different from Own. Then by Def. 7.1(1) of comparable class tables we have
$CT(C) = CT'(C)$ and in particular both class tables have the same declaration

$$T\;m(\bar{T}\,\bar{x})\,\{S\}.$$

To show (†), suppose $conf\,C\,(h,\eta)$, $conf\,C\,(h',\eta')$, $\mathcal{R}\,Heap\,h\,h'$, and $\mathcal{R}\,\Gamma\,\eta\,\eta'$,
where $\Gamma = \bar{x}:\bar{T}, self:C$. Then by Lemma 7.23 below, using $\mathcal{R}\;MEnv\;\mu_i\;\mu'_i$, the
results from $S$ are related. That is, either $[\![\Gamma \vdash S]\!]\mu_i(h,\eta) = \bot = [\![\Gamma \vdash' S]\!]'\mu'_i(h',\eta')$
or neither is $\bot$. In the latter case, $(h_0,\eta_0)$ is related to $(h'_0,\eta'_0)$ where $(h_0,\eta_0) =
[\![\Gamma \vdash S]\!]\mu_i(h,\eta)$ and $(h'_0,\eta'_0) = [\![\Gamma \vdash' S]\!]'\mu'_i(h',\eta')$. Then, by definition of $\mathcal{R}\,\Gamma$,
$\mathcal{R}\,\Gamma\,\eta_0\,\eta'_0$ implies $\mathcal{R}\,T\,(\eta_0\,result)\,(\eta'_0\,result)$. Thus (†) holds by definition of $\mu_{i+1}$
and $\mu'_{i+1}$. This concludes the base case of the secondary induction.

For the induction step, suppose $depth(m,C) > 0$. By induction on depth we
have, by definition of depth,

$$\mathcal{R}\,(C,\bar{x},\bar{T}{\to}T)\,(\mu_{i+1}\,(super\,C)\,m)\,(\mu'_{i+1}\,(super\,C)\,m). \qquad (\ddagger)$$

If $m$ is declared in both $CT(C)$ and $CT'(C)$ then the argument is the same as in the
base case of the secondary induction. If $m$ is inherited in both $CT(C)$ and $CT'(C)$
then (†) follows from (‡) because the semantics defines $\mu_{i+1}\,C\,m$ by restriction
from $(\mu_{i+1}\,(super\,C)\,m)$ and restriction preserves simulation. (This is Lemma 7.19,
which is applicable because if $B > Own$ and $m$ is inherited in Own from $B$ then
$\bar{T} \not\le Rep$ and $\bar{T} \not\le Rep'$ by confinement of $CT$, $CT'$, Def. 6.9(4).) The remaining
possibility is that $m$ is declared in $CT(C)$ and inherited in $CT'(C)$ from some $B$
(or the other way around). Then $C = Own$, by comparability of $CT$ and $CT'$.
Using Def. 7.9(2b) and (*) we get

$$\mathcal{R}\,(Own,\bar{x},\bar{T}{\to}T)\,([\![M]\!]\mu_i)\,(restr([\![M_B]\!]'\mu'_i, Own))$$

and thus (†) by definition of $\mu_{i+1}$ and $\mu'_{i+1}$. □

Lemma 7.21 (establishment by constructors) Let $\mu$ and $\mu'$ be any method
environments. Then the following holds for any non-rep class $C \not\le Own$.

For all $(h,\ell) \in [\![Heap \otimes C]\!]$ and $(h',\ell') \in [\![Heap \otimes C]\!]'$, if $conf\,C\,(h,\eta_1)$, $conf\,C\,(h',\eta'_1)$
and $\mathcal{R}\,(Heap \otimes C)\,(h,\ell)\,(h',\ell')$ then $\mathcal{R}\,Heap_\bot\,h_0\,h'_0$, where

$$\eta_1 = [self \mapsto \ell], \qquad h_0 = [\![self:C \vdash constr\,C : con]\!]\mu(h,\eta_1),$$

$$\eta'_1 = [self \mapsto \ell'], \qquad h'_0 = [\![self:C \vdash' constr\,C : con]\!]'\mu'(h',\eta'_1).$$

Proof. By well founded induction on C with respect to <C. Suppose conf C {h, rj), 
confC{h',r]'), and TZ {Heap ® C) {h,£) {h' ,('). Let hi = [self : C h 5 : con]/Li(/i, 77) 
be as in the semantics of S' as a constructor, and similarly for h'^. If super C = 
Object then hi = h and thus TZ Heap hi h'^ by hypothesis. Otherwise, hi = 
{self : super C \- constr{superC) ■.conjiJ,{h,ri) and we get TZ Heap hi h'l by the in- 
duction hypothesis noting that superC <C C by Lemma 4.9. It follows that 
TZ{Heap®C) {hi, I) {h'i,l'). 

It remains to show $\mathcal{R}\,Heap_\bot\,h_0\,h'_0$, where $h_0, h'_0$ are obtained by applying the
command semantics of $constr\,C$ to $(h_1,\ell)$ and $(h'_1,\ell')$. This holds because, taking $S$
to be $constr\,C$ in the claim below, we get $\mathcal{R}\,(Heap \otimes C)_\bot\,(h_0,\eta_0)\,(h'_0,\eta'_0)$ and thus
either both outcomes are $\bot$ or $\mathcal{R}\,Heap\,h_0\,h'_0$.

Claim: For all self:C h S such that S has no method calls and every new B 
in S has B \Z C, for all {h,r]) and {h',ri'), if conf C {h,ri), conf C {h' ,r]'), and 
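$\mathcal{R}\,(Heap \otimes \Gamma)\,(h,\eta)\,(h',\eta')$ then

$$\mathcal{R}\,(Heap \otimes \Gamma)_\bot\,([\![\Gamma \vdash S]\!]\mu(h,\eta))\,([\![\Gamma \vdash' S]\!]'\mu'(h',\eta')).$$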

Proof of the claim is by induction on the structure of S. Note that by hypothesis 
S has no method calls, so ji is not relevant. The argument is exactly the same as 
in the proof of Lemma 7.23 below, except for the case of new. In the proof of 
Lemma 7.23, the case of new appeals to the present Lemma for constructors. To 
prove the claim for this case, the argument is the same except for appealing to the 
main induction hypothesis; this is sound because the claim includes the hypothesis 
that if new B occurs in S then B \z C and thus B ^ C by Lemma 4.9. □ 

Lemma 7.22 (preservation by expressions) For any non-rep class $C \neq Own$
and any constituent expression $\Gamma \vdash e : T$ of a method declared in $C$, the following
holds: For all $(h,\eta) \in [\![Heap \otimes \Gamma]\!]$ and $(h',\eta') \in [\![Heap \otimes \Gamma]\!]'$, if
$\mathcal{R}\,(Heap \otimes \Gamma)\,(h,\eta)\,(h',\eta')$ then

$$\mathcal{R}\,(T_\bot)\,([\![\Gamma \vdash e : T]\!](h,\eta))\,([\![\Gamma \vdash' e : T]\!]'(h',\eta')).$$

Proof. By induction on the derivation of $\Gamma \vdash e : T$. For each case of $e$, we give
an argument assuming that $\Gamma, C, T, \eta, \eta', h, h'$ satisfy the hypotheses of the Lemma.

Case $\Gamma \vdash (B)\,e : B$. Induction on $e$ yields that $\mathcal{R}\,D_\bot\,\ell\,\ell'$ (or else both denotations
of $e$ are $\bot$). By confinement of $e$, as $C \neq Own$ and $C$ is non-rep, we have
$\ell \notin locs(Rep{\downarrow})$ and $\ell' \notin locs(Rep'{\downarrow})$. Thus, $\ell' = \ell$ by Lemma 7.12. Hence either
both semantics yield $\ell$, whence $\mathcal{R}\,B_\bot\,\ell\,\ell$, or both yield $\bot$ and again $\mathcal{R}\,B_\bot\,\bot\,\bot$.

Case $\Gamma \vdash e \;\mathrm{is}\; B : bool$. The argument is similar to that for type cast.

Case $\Gamma \vdash e.f : T$. By induction on $e$ we have $\mathcal{R}\,C_\bot\,\ell\,\ell'$, hence $\ell = \ell'$ by
Lemma 7.12. In the non-$\bot$ case, $\ell \neq nil$. By closure of the heaps, $\ell \in dom\,h$
and $\ell \in dom\,h'$. We consider cases on whether $C \le Own$. Consider confining
partitions $(Ch * Oh_1 * Rh_1 \ldots) = h$ and $(Ch' * Oh'_1 * Rh'_1 \ldots) = h'$ that have
corresponding islands as in the definition of $\mathcal{R}\,Heap$. In the case $C \le Own$, we
have $\ell \in locs(Own{\downarrow})$ and hence $\ell$ in some $dom(Oh_i)$. From $\mathcal{R}\,Heap\,h\,h'$ we have

$$R\,(Oh_i * Rh_i)\,(Oh'_i * Rh'_i)$$

and thus $\ell \in dom(Oh'_i)$ by basic coupling Def. 7.4(1). Since $C \neq Own$, we know by
visibility that $f$ is not in the private fields $\bar{g}$ of Own. Thus, as $type(f, loctype\,\ell) =
T$, we have $\mathcal{R}\,T\,(h\,\ell\,f)\,(h'\,\ell\,f)$ by Def. 7.4(3) and Lemma 7.12.

Finally, in the case $C \not\le Own$ (recall that $C$ is non-rep and $C \neq Own$ by
hypothesis), we have $\ell \in dom(Ch)$ and hence $\ell \in dom(Ch')$ by definition of $\mathcal{R}\,Heap$.
Hence $\mathcal{R}\,(state\,(loctype\,\ell))\,(h\,\ell)\,(h'\,\ell)$ and thus $\mathcal{R}\,T\,(h\,\ell\,f)\,(h'\,\ell\,f)$ by definition of
$\mathcal{R}\,(state\,(loctype\,\ell))$.

The remaining cases are straightforward. See Appendix. □

Lemma 7.23 (preservation by commands) Suppose $\mathcal{R}$ is a simulation, and
moreover $\mu$ and $\mu'$ are confined method environments such that $\mathcal{R}\;MEnv\;\mu\;\mu'$.
Then the following holds for any non-rep class $C \neq Own$. For any constituent
command $\Gamma \vdash S$ in a method declaration in $CT(C)$ and any $(h,\eta)$ and $(h',\eta')$, if
$conf\,C\,(h,\eta)$, $conf\,C\,(h',\eta')$, and $\mathcal{R}\,(Heap \otimes \Gamma)\,(h,\eta)\,(h',\eta')$ then

$$\mathcal{R}\,(Heap \otimes \Gamma)_\bot\,([\![\Gamma \vdash S]\!]\mu(h,\eta))\,([\![\Gamma \vdash' S]\!]'\mu'(h',\eta')).$$

Proof. For any $C$, the proof is by structural induction on $\Gamma \vdash S$.

Case $\Gamma \vdash x := e$. As $CT$ is confined, constituent $e$ of the assignment is confined.
So by Lemma 7.22 we have $\mathcal{R}\,T_\bot\,d\,d'$. Hence, by $\mathcal{R}\,\Gamma\,\eta\,\eta'$ and definition of $\mathcal{R}\,\Gamma$,
we have $\mathcal{R}\,\Gamma\,[\eta \mid x \mapsto d]\,[\eta' \mid x \mapsto d']$ whence the result.

Case $\Gamma \vdash e_1.f := e_2$. By Lemma 7.22 for $e_1$ we have $\mathcal{R}\,C\,\ell\,\ell'$ hence $\ell = \ell'$
by Lemma 7.12. By Lemma 7.22 for $e_2$ we have $\mathcal{R}\,U\,d\,d'$ and hence $\mathcal{R}\,T\,d\,d'$
by Lemma 7.13, where $(f:T) \in dfields\,C$ as in the typing rule. To conclude the
argument it suffices to show

$$\mathcal{R}\,Heap\,[h \mid \ell \mapsto [h\,\ell \mid f \mapsto d]]\,[h' \mid \ell \mapsto [h'\,\ell \mid f \mapsto d']]. \qquad (*)$$

Consider confining partitions $(Ch * Oh_1 * Rh_1 \ldots) = h$ and $(Ch' * Oh'_1 * Rh'_1 \ldots) = h'$
that correspond as in the definition of $\mathcal{R}\,Heap\,h\,h'$. We argue by cases on $C$.

- $C \le Own$: Then $loctype\,\ell \le C \le Own$. From typing we have $e_1 : C$ and hence
there is some $i$ with $\{\ell\} = dom(Oh_i)$ and by $\mathcal{R}\,Heap\,h\,h'$ we get

$$R\,(Oh_i * Rh_i)\,(Oh'_i * Rh'_i)$$

and so $\{\ell\} = dom(Oh'_i)$. By typing and $C \neq Own$, field $f$ is not in the private
fields $\bar{g}$ of Own. So (*) follows from $\mathcal{R}\,Heap\,h\,h'$ and $\mathcal{R}\,T\,d\,d'$.
- $C \not\le Own$: As $C$ is non-rep, we have $\ell \in dom\,Ch$ and thus $\ell \in dom\,Ch'$ by
hypothesis $\mathcal{R}\,Heap\,h\,h'$. Moreover, $\mathcal{R}\,(state\,(loctype\,\ell))\,(h\,\ell)\,(h'\,\ell)$ and so by
$\mathcal{R}\,T\,d\,d'$ we get $\mathcal{R}\,(state\,(loctype\,\ell))\,[h\,\ell \mid f \mapsto d]\,[h'\,\ell \mid f \mapsto d']$. Hence (*).

Case $\Gamma \vdash x := \mathbf{new}\,B$. By confinement of $CT$, this command is confined and
hence the final states are confined: $conf\,C\,(h_0,\eta_0)$ and $conf\,C\,(h'_0,\eta'_0)$. We have $C \not\le
Rep$ and $C \neq Own$. In the case $C \not\le Own$ confinement of $\eta_0$ and $\eta'_0$ implies $rng\,\eta_0 \cap
locs(Rep{\downarrow}) = \emptyset = rng\,\eta'_0 \cap locs(Rep'{\downarrow})$. So $\ell \notin locs(Rep{\downarrow})$ and $\ell' \notin locs(Rep'{\downarrow})$,
hence by typing $B$ is non-rep. In the case $C \le Own$, we have $B$ non-rep by
Assumption 7.15 (no reps in sub-owners). Either way, $B$ is non-rep so Lemma 7.10
applies, to yield $dom\,h \cap locs\,B = dom\,h' \cap locs\,B$. Thus by parametricity of $fresh$
we have $\ell = fresh(B,h) = fresh(B,h') = \ell'$. So, by Lemma 7.12 and $\mathcal{R}\,\Gamma\,\eta\,\eta'$ we
have $\mathcal{R}\,\Gamma\,\eta_0\,\eta'_0$.

It remains to show $\mathcal{R}\,Heap_\bot\,h_0\,h'_0$ in order to get the final result $\mathcal{R}\,(Heap \otimes
\Gamma)_\bot\,(h_0,\eta_0)\,(h'_0,\eta'_0)$. We argue by cases on $B$.

- $B \not\le Own$: Writing $fields'$ for the fields given by $CT'$, we have $fields\,B = fields'\,B$
and thus $\mathcal{R}\,(state\,B)\,[fields\,B \mapsto defaults]\,[fields'\,B \mapsto defaults]$. So, as $B$ is
non-rep and $B \not\le Own$, we can add $\ell$ to $Ch$ and $Ch'$ to get partitions that
witness $\mathcal{R}\,Heap\,h_1\,h'_1$. We also have $conf\,B\,(h_1,\eta_1)$ and $conf\,B\,(h'_1,\eta'_1)$ because
$conf\,h$, $conf\,h'$, and defaults do not contain any locations. Now by Lemma 7.21
we get $\mathcal{R}\,Heap_\bot\,h_0\,h'_0$. Combining this with what was shown above we have
$\mathcal{R}\,(Heap \otimes \Gamma)_\bot\,(h_0,\eta_0)\,(h'_0,\eta'_0)$.
- $B \le Own$: By basic coupling, Def. 7.9, we have $\mathcal{R}\,Heap_\bot\,h_0\,h'_0$.

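Case $\Gamma \vdash x := e.m(\bar{e})$. By Lemma 7.22 for $e$ we have $\mathcal{R}\,D_\bot\,\ell\,\ell'$, hence $\ell = \ell'$ by
Lemma 7.12. Let $\eta_1 = [self \mapsto \ell, \bar{x} \mapsto \bar{d}]$ and $\eta'_1 = [self \mapsto \ell, \bar{x} \mapsto \bar{d}']$. By confinement
of $x := e.m(\bar{e})$ (Def. 6.7) we have confined arguments, i.e., $conf\,(loctype\,\ell)\,(h,\eta_1)$
and $conf\,(loctype\,\ell)\,(h',\eta'_1)$. By Lemma 7.22 for $\bar{e}$ we have $\mathcal{R}\,\bar{U}_\bot\,\bar{d}\,\bar{d}'$ and hence
$\mathcal{R}\,\bar{U}\,\bar{d}\,\bar{d}'$ as we are considering the non-$\bot$ case. Thus $\mathcal{R}\,[\bar{x}:\bar{T}, self : loctype\,\ell]\,\eta_1\,\eta'_1$
by Lemma 7.13. From $\mathcal{R}\;MEnv\;\mu\;\mu'$ we get

$$\mathcal{R}\,(loctype\,\ell,\bar{x},\bar{T}{\to}T)\,(\mu\,(loctype\,\ell)\,m)\,(\mu'\,(loctype\,\ell)\,m)$$

hence, as $h, h', \eta_1, \eta'_1$ are confined and related, $\mathcal{R}\,(Heap \otimes T)_\bot\,(h_1,d_1)\,(h'_1,d'_1)$,
where $(h_1,d_1) = \mu\,(loctype\,\ell)\,m\,(h,\eta_1)$ and $(h'_1,d'_1) = \mu'\,(loctype\,\ell)\,m\,(h',\eta'_1)$. Thus
$\mathcal{R}\,T\,d_1\,d'_1$ and $\mathcal{R}\,Heap\,h_1\,h'_1$. It remains to show that the updated stores $[\eta \mid x \mapsto
d_1]$ and $[\eta' \mid x \mapsto d'_1]$ are related for $\Gamma$. This follows from $\mathcal{R}\,T\,d_1\,d'_1$ and $T \le \Gamma\,x$,
using Lemma 7.13.

The remaining cases are similar. See Appendix. □
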
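
The case of new in the proof of Lemma 7.23 relies on parametricity of the allocator: the fresh location chosen for class B depends only on the B-locations already in use. The following Java sketch illustrates one allocator with that property. It is an assumption made for illustration (the names FreshSketch, Loc and fresh are hypothetical, and a heap is represented only by its domain); the paper does not prescribe this particular choice function.

```java
import java.util.Set;

final class FreshSketch {
    /** A location is a class name paired with an index. */
    record Loc(String cls, int index) {}

    /**
     * fresh(B, h): the unused location of class B with the least index.
     * The result depends only on the intersection of dom h with the locations of B,
     * so heaps that agree on their B-locations receive the same fresh location.
     */
    static Loc fresh(String b, Set<Loc> domH) {
        int i = 0;
        while (domH.contains(new Loc(b, i))) {
            i++;
        }
        return new Loc(b, i);
    }
}
```

Two heaps whose rep objects differ in number or class, as in a version with a sentinel node and a version without, still agree on the next client location under this allocator, which is what the proof uses to conclude that the two allocated locations are equal.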
8. APPLICATIONS AND FURTHER EXAMPLES 

In this section we use the abstraction theorem to show some program equivalences
for the examples discussed in Sections 2 and 3. Then we discuss further variations
on the observer pattern.

To establish the hypothesis of the abstraction theorem for the examples we use
the couplings given as examples in Sect. 7.2. Both the theorem and these couplings
are defined in terms of the semantics. To show that the couplings are simulations
we argue directly in terms of the semantics. For practical purposes in program
verification, the abstraction theorem would be expressed syntactically as a proof
rule and rules for program constructs would be used to establish the simulation
property [Reynolds 1981a; Jones 1986; Morgan and Gardiner 1990; de Roever and
Engelhardt 1998]. Adequate proof rules for a language like ours remain an open
challenge (see Sect. 12).

8.1 Program equivalence 

We take program to mean a well formed class table $CT$ together with a command
$\Gamma \vdash S$. We consider the object states reachable from variables of $\Gamma$ to be the
inputs and outputs of the program. For example, if $S$ is the body of method main in
Sect. 2.1 then $\Gamma$ is self:Main and what can be reached is self and the string self.inout.
We restrict attention to confined programs, meaning that $CT$ and $\Gamma \vdash S$ are
confined. Thus, by Theorem 6.17 the method environment $[\![CT]\!]$ is confined. To prove
program equivalence using the abstraction theorem, we need to both introduce and
eliminate a simulation. Elimination is by identity extension Lemma 7.17 and
introduction is by the related Lemma 7.18. There is a small technicality: To establish
the hypothesis of Lemma 7.23, we require w.l.o.g. that $S$ occurs in some method
of $CT$.

We compare programs only for class tables $CT$, $CT'$ that are comparable in the
sense of Def. 7.1, and with commands in the same context $\Gamma$. As commands denote
functions on global states, the obvious notion of equivalence is that $[\![\Gamma \vdash S]\!]$ and
$[\![\Gamma \vdash' S']\!]'$ are equal as functions. By Lemma 7.11, $[\![T]\!] = [\![T]\!]'$ for any $T$, but
in general the semantic domains differ for owner object states which may have
different private fields. A global state $(h,\eta) \in [\![Heap \otimes \Gamma]\!]$ for $CT$ need not be an
element of $[\![Heap \otimes \Gamma]\!]'$ for $CT'$. However, an Own-free heap in $[\![Heap]\!]$ is also an
element of $[\![Heap]\!]'$. So we compare command meanings on the Own-free states.

Definition 8.1 (client program equivalence) Suppose programs $CT, (\Gamma \vdash S)$
and $CT', (\Gamma \vdash' S')$ are such that $CT$, $CT'$ are comparable and confined, and
moreover $S$ (resp. $S'$) occurs in $CT$ (resp. $CT'$). The programs are equivalent iff

$$collect([\![\Gamma \vdash S]\!]\mu(h,\eta)) = collect([\![\Gamma \vdash' S']\!]'\mu'(h,\eta))$$

for all confined and Own-free $(h,\eta) \in [\![Heap \otimes \Gamma]\!]$, where $\mu = [\![CT]\!]$ and $\mu' =
[\![CT']\!]'$. □

If $\Gamma\,self \le Own$ then $\eta$ cannot be Own-free. The resulting vacuous quantification
makes the definition equate all commands for such $\Gamma$. But we are only interested
in using the definition for clients. Simulation is the relation of interest between
owners.

The static analysis for confinement (Sect. 11) can be used to show that each of the
following examples is confined for the appropriate Own and Rep.

Example 8.2 Consider the command $S$ comprising the body of method main of
class Main in Sect. 2.1 and take $\Gamma$ = (self: Main). As $CT$ we take the declarations
of Main, Bool, and the first version of OBool. For $CT'$ we use the second version
of OBool. Let Rep and Rep' be Bool and Own be OBool.

To show that $CT, (\Gamma \vdash S)$ is equivalent to $CT', (\Gamma \vdash' S)$, recall the basic coupling
of Example 7.5 and let $\mathcal{R}$ be the induced coupling. Let $(h,\eta)$ be any confined state
for $\Gamma$, noting that $Main \not\le Own$ so $\eta$ is Own-free. Let $\mu = [\![CT]\!]$ and $\mu' = [\![CT']\!]'$.
To show

$$collect([\![\Gamma \vdash S]\!]\mu(h,\eta)) = collect([\![\Gamma \vdash' S]\!]'\mu'(h,\eta)), \qquad (*)$$

note first that $\mathcal{R}\,(Heap \otimes \Gamma)\,(h,\eta)\,(h,\eta)$ by Lemma 7.18. It is straightforward to
show that $\mathcal{R}$ is established by the constructors and preserved by the methods of
OBool; thus $\mathcal{R}$ is a simulation. The abstraction theorem yields $\mathcal{R}\;MEnv\;\mu\;\mu'$. This
in turn justifies application of the preservation Lemma 7.23 to command $S$, as its
context Main is a non-rep class $\neq Own$. Thus the outcomes $[\![\Gamma \vdash S]\!]\mu(h,\eta)$ and
$[\![\Gamma \vdash' S]\!]'\mu'(h,\eta)$ are related by $\mathcal{R}$. By definition of $\mathcal{R}$, either both outcomes are $\bot$,
in which case (*) holds by definition of $collect$, or the outcomes are non-$\bot$ states
$(h_0,\eta_0)$ and $(h'_0,\eta'_0)$ with $\mathcal{R}\,(Heap \otimes \Gamma)\,(h_0,\eta_0)\,(h'_0,\eta'_0)$. Note that $h_0$ and $h'_0$ each
contains at least one owner, the one constructed in $S$. But $Main \not\le Own$, so $rng\,\eta_0$
and $rng\,\eta'_0$ are Own-free. Moreover, the owners were reached only by variable z
which is local in $S$; they are not reachable via fields of the objects $h_0(\eta_0\,self)$ or
$h'_0(\eta'_0\,self)$. That is, both $collect(h_0,\eta_0)$ and $collect(h'_0,\eta'_0)$ are Own-free. Thus
by identity extension Lemma 7.17 we have $collect(h_0,\eta_0) = collect(h'_0,\eta'_0)$ which
concludes the proof of (*). □

This proof depends on parametricity of the allocator, because that is needed
for the abstraction theorem. The same argument will go through, however, for the
second abstraction theorem in the sequel which drops parametricity of the allocator.
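
The shape of Example 8.2 can be pictured with a small Java sketch. The two owner classes below are hypothetical stand-ins, not the declarations of Sect. 2.1: the sketch assumes that one version stores the boolean directly in its rep and the other stores its negation, so that the coupling relates a rep holding b to a rep holding not b. The names OBoolSketch, OBool1, OBool2 and client are likewise assumptions.

```java
final class OBoolSketch {
    /** Rep class: a mutable boolean cell. */
    static final class Bool {
        private boolean f;
        void set(boolean x) { f = x; }
        boolean get() { return f; }
    }

    /** Client-visible operations shared by both owner versions. */
    interface OBool {
        void set(boolean x);
        boolean get();
    }

    /** Version 1 (assumed): the rep stores the value itself. */
    static final class OBool1 implements OBool {
        private final Bool g = new Bool();
        public void set(boolean x) { g.set(x); }
        public boolean get() { return g.get(); }
    }

    /** Version 2 (assumed): the rep stores the negation of the value. */
    static final class OBool2 implements OBool {
        private final Bool g = new Bool();
        OBool2() { g.set(true); }                  // encodes false, matching version 1's default
        public void set(boolean x) { g.set(!x); }
        public boolean get() { return !g.get(); }
    }

    /** A client in the role of main: it reaches the owner through a local variable only. */
    static String client(OBool z) {
        StringBuilder out = new StringBuilder();
        out.append(z.get());
        z.set(true);
        out.append(z.get());
        z.set(false);
        out.append(z.get());
        return out.toString();
    }
}
```

Because the rep is private to the owner and the owner is dropped when the client returns, the only state left to compare is the client's own output, which is the role played by collect and the identity extension lemma in the proof.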

Example 8.3 Recall the Meyer-Sieber-O'Hearn example from Sect. 3.1, and in
particular the command

C y := new C in A x := new A in x.callP(y)    (‡)

Take (‡) to be the body of method main in

class Main extends Object { unit main(){ . . . } }

To be very precise we need to include a class

class Rep extends Object { }

so we can take Rep and Rep' to be Rep which is not comparable to the classes C
and A of interest. Let Own be A. Let CT consist of the declarations of A, Rep,
Main, and an arbitrary class

class C extends Object { unit P(A z){ . . . } . . . }

such that methods of C satisfy the confinement conditions. Then $CT$ and $CT'$
are confined, because no reps are constructed or manipulated. We use the basic
coupling of Example 7.6. To appeal to the abstraction theorem, we must argue
that $\mathcal{R}$ is a simulation. The constructors are skip and the default value for
field g establishes the relation. Preservation by inc is straightforward because both
versions have the same code and it makes no method calls. We give the details
for preservation by callP. The relevant condition is Def. 7.9(2a). To show it for
callP, suppose $i \ge 0$ and $\mathcal{R}\;MEnv\;\mu_i\;\mu'_i$. Note that $\mu_i$ and $\mu'_i$ are confined, by
Theorem 6.17. Suppose that $\mathcal{R}\,(Heap \otimes y:C, self:A)\,(h,\eta)\,(h',\eta')$ with $conf\,A\,(h,\eta)$
and $conf\,A\,(h',\eta')$. In both versions of callP, the body is a sequence and the first
command is y.P(self). Let $\eta_1 = [z \mapsto \eta\,self, self \mapsto \eta\,y]$ and $\eta'_1 = [z \mapsto \eta'\,self, self \mapsto
\eta'\,y]$ be the environments for semantics of this call. By definition of $\mathcal{R}$ we get
$\mathcal{R}\,(Heap \otimes z:A, self:C)\,(h,\eta_1)\,(h',\eta'_1)$. From the hypothesis $conf\,A\,(h,\eta)$ we get
$conf\,C\,(h,\eta_1)$ and likewise $conf\,C\,(h',\eta'_1)$. Applying the hypothesis $\mathcal{R}\;MEnv\;\mu_i\;\mu'_i$ to
these environments we get that either $\mu_i\,C\,P\,(h,\eta_1) = \bot = \mu'_i\,C\,P\,(h',\eta'_1)$ or neither are
$\bot$ and $\mathcal{R}\,(Heap \otimes unit)\,(h_0, it)\,(h'_0, it)$ where $(h_0, it) = \mu_i\,C\,P\,(h,\eta_1)$ and $(h'_0, it) =
\mu'_i\,C\,P\,(h',\eta'_1)$. The call is desugared to an assignment of the result value to a local
but the value is discarded for both versions, so the states following the calls are
$(h_0,\eta)$ and $(h'_0,\eta')$ and we have $\mathcal{R}\,(Heap \otimes y:C, self:A)\,(h_0,\eta)\,(h'_0,\eta')$. In these
states we have $h_0\,\ell\,g = h'_0\,\ell\,g \;\wedge\; h_0\,\ell\,g \bmod 2 = 0$. So the command

if self.g mod 2 = 0 then abort else skip fi

aborts, as does its counterpart which is simply abort. This concludes the argument
that the bodies of callP are related.

Having established the antecedents of the abstraction theorem, we conclude that
the command (‡) preserves $\mathcal{R}$. By semantics of the second version of A we know
callP aborts, so both interpretations of (‡) abort. The programs are equivalent. □

This example is handled without using the identity extension Lemma 7.17, but
that is only because the example uses abortion. In subsequent examples the proof
needs all the steps of the one for Example 8.2. The steps are not spelled out in
detail; only the interesting bits are highlighted.
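
The reasoning in Example 8.3 can be mirrored in Java as a rough sketch. Several details are assumptions made for illustration: abort is modelled by an unchecked exception, the client method P is modelled by a Consumer, and the body of inc is taken to add 2 to g, which is one way to keep g even. The names CallPSketch, A1, A2 and aborts are hypothetical.

```java
import java.util.function.Consumer;

final class CallPSketch {
    /** Stand-in for abort. */
    static final class Abort extends RuntimeException {}

    /** Code shared by both versions of A: the private field g stays even. */
    static abstract class A {
        private int g = 0;                  // the default value establishes the invariant
        void inc() { g += 2; }              // assumed body; it preserves evenness
        protected int g() { return g; }
        abstract void callP(Consumer<A> p);
    }

    /** First version: calls the client, then tests the invariant. */
    static final class A1 extends A {
        void callP(Consumer<A> p) {
            p.accept(this);
            if (g() % 2 == 0) throw new Abort();
        }
    }

    /** Second version: calls the client, then aborts unconditionally. */
    static final class A2 extends A {
        void callP(Consumer<A> p) {
            p.accept(this);
            throw new Abort();
        }
    }

    /** Runs callP and reports whether it aborted. */
    static boolean aborts(A a, Consumer<A> p) {
        try {
            a.callP(p);
            return false;
        } catch (Abort e) {
            return true;
        }
    }
}
```

Since a client can change g only through inc, no choice of P can make the test in the first version fail, so both versions abort for every client; this is the informal content of the simulation argument above.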

Example 8.4 We consider the observer pattern, taking Own to be Observable. Let
$CT$ be given by the first version, Fig. 2, together with the client given in Fig. 3.
Let $CT'$ be given by the sentinel version of Fig. 4 together with Fig. 3. We consider
equivalence for the command self:Main, ob:AnObserver $\vdash S$ where $S$ is the body of
Main.main. Because obi is local to $S$, no owners are reachable in the final state.

Taking Rep, Rep' to be Node, Node2, we use the coupling relation of Example 7.7.
Clearly the constructors establish the relation. To show that method add preserves
it, note that the bodies of these methods are both sequential compositions; both
construct a new node and then set its ob field to the value passed as a parameter.
The next step is to add it to the beginning of the list; the difference between the
two versions is that self.snt.nxt is assigned in Fig. 4 whereas self.fst is assigned in
Fig. 2. Both versions of add then invoke methods on the new node n. In practice
one would argue in terms of the behavior of those methods. Note that they need not
preserve the relation; it is just that their behavior is used to maintain the relation.
To give a precise argument in terms of the semantics, we consider cases on $i$. For
$i = 0$, both $\mu_0$ and $\mu'_0$ make every method abort, in which case the body of add
aborts due to method calls. As the methods in class Node and class Node2 are not
recursive, their semantics is already completely defined for $i = 1$, so for $i > 0$ the
behavior of add is to insert nodes at the head of the list, maintaining the relation.

The remaining owner method is notifyAll. Again, the two versions are similar
except for skipping over the sentinel node. To argue that the calls to getNext act
correctly one considers cases as in the proof for add. For the calls to notify on
the Observer objects, recall that by the relation, the related lists contain the same
Observer pointers in the same order. The two versions thus make the same series
of invocations of notify. Each of those calls preserves the relation by hypothesis
$\mathcal{R}\;MEnv\;\mu_i\;\mu'_i$. □

The last step of the argument, concerning invocations of notify, is like reasoning
about invocations of P in Example 8.3. This example has the additional
complication of calls to objects within the owner island. The case distinction between
$i = 0$ and $i > 0$ is needed because our argument is purely in semantic terms. In a
practical proof system, one would reason only in terms of the actual semantics of
the methods involved rather than its approximants.

Strictly speaking, use of Lemma 7.23 depends on desugaring the examples, and
the desugarings of Remark 4.1 do not include loops. We return to this issue in Sect. 9.
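
For readers who prefer running code, the two representations compared in Example 8.4 can be sketched in Java as follows. This is an illustrative approximation, not the code of Figs. 2 and 4: a single Node class serves both versions, and the methods are named update and notifyObservers (hypothetical names) because notify and notifyAll are final methods of java.lang.Object and cannot be redeclared.

```java
import java.util.ArrayList;
import java.util.List;

final class ObserverSketch {
    interface Observer { void update(); }

    static final class Node { Observer ob; Node nxt; }

    /** First version: fst points at the first node, or is null for the empty list. */
    static final class Observable1 {
        private Node fst;
        void add(Observer ob) {
            Node n = new Node();
            n.ob = ob;
            n.nxt = fst;
            fst = n;                          // insert at the head
        }
        void notifyObservers() {
            for (Node n = fst; n != null; n = n.nxt) n.ob.update();
        }
    }

    /** Sentinel version: snt is a dummy node and the list proper starts at snt.nxt. */
    static final class Observable2 {
        private final Node snt = new Node();
        void add(Observer ob) {
            Node n = new Node();
            n.ob = ob;
            n.nxt = snt.nxt;
            snt.nxt = n;                      // insert just after the sentinel
        }
        void notifyObservers() {
            for (Node n = snt.nxt; n != null; n = n.nxt) n.ob.update();
        }
    }

    /** Registers one logging observer per name and returns the order of notifications. */
    static List<String> trace(boolean sentinel, String... names) {
        List<String> log = new ArrayList<>();
        Observable1 o1 = new Observable1();
        Observable2 o2 = new Observable2();
        for (String name : names) {
            Observer ob = () -> log.add(name);
            if (sentinel) o2.add(ob); else o1.add(ob);
        }
        if (sentinel) o2.notifyObservers(); else o1.notifyObservers();
        return log;
    }
}
```

The coupling relates a list reached from fst to the list reached from snt.nxt holding the same observers in the same order, so both versions issue the same sequence of notifications.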

Example 8.5 Suppose we change the client of Fig. 3 to use the following.

class AnObserver extends Observer { unit notify(){ skip } }

Then in Fig. 4 we can replace the body of Observable.notifyAll by skip and still
have equivalence with the implementation of Fig. 2. What changes with respect to
Example 8.4 is that the two implementations do not make the corresponding calls
to notify. But because AnObserver.notify is skip, calling it has the same effect as
not calling it; in particular, the relation is preserved.

class Node1 extends Object { // rep for Observable
  Observer ob;
  Node1 nxt;
  unit setOb(Observer o){ self.ob := o }
  unit add(Node1 n){
    Observer o := n.ob; n.ob := self.ob; self.ob := o; n.nxt := self.nxt; self.nxt := n }
  unit notifyAll(){ self.ob.notify(); if self.nxt ≠ null then self.nxt.notifyAll() else skip fi } }
class Observable extends Object { // owner
  Node1 fst;
  unit add(Observer ob){
    Node1 n := new Node1; n.setOb(ob); if self.fst = null then self.fst := n else self.fst.add(n) fi }
  unit notifyAll(){ if self.fst ≠ null then self.fst.notifyAll() else skip fi } }

Fig. 8. Version of the observer pattern in object-oriented style: nodes are active.

class Node3 extends Object { // rep for Observable
  Node3 nxt;
  unit notif(){ skip }
  unit notifyAll(){ self.notif(); if self.nxt ≠ null then self.nxt.notifyAll() else skip fi }
  unit add(Observer ob){ Node0 n := new Node0; n.setOb(ob); n.nxt := self.nxt; self.nxt := n } }
class Node0 extends Node3 { // rep subclass
  Observer ob;
  unit setOb(Observer o){ self.ob := o }
  unit notif(){ self.ob.notify() } }
class Observable extends Object { // owner
  Node3 snt;
  con{ self.snt := new Node3 }
  unit add(Observer ob){ self.snt.add(ob) }
  unit notifyAll(){ self.snt.notifyAll() } }

Fig. 9. Sentinel in object-oriented style. In class Node3, method add constructs an object of the
subclass Node0 and method notifyAll uses dynamic dispatch of notif.

The argument here is not modular: by contrast with the preceding example, here 
we reason directly in terms of the client code. □ 

8.2 Further variations on observer 

Fig. 8 gives another implementation of Observable, using a singly linked list but
with most of the work delegated to methods of Node1. Method add of class Node1
in the Figure is an example of class-based visibility: The private fields of object n
are both assigned and read.

Unlike the example of Sect. 3.1, where method P is called once by callP, method
Observable.notifyAll invokes notify on multiple objects, and multiple times if some
of those are aliases. By sharing state, it is possible for multiple observers to detect
the order in which they are notified. In our versions of Observable, method add
maintains the set in last-in order. In Fig. 8, method add in Node1 shuffles pointers
to maintain the last-in order.

A less awkward version, using a sentinel node, is given in Fig. 9.

The following example indicates the limits of what can be proved using the
abstraction theorem. For this discussion, instead of treating loops as syntactic sugar
we assume they are in the language. The semantic clause would use a fixpoint
but this is separate from the fixpoint of the approximation chain used for method
meanings. Thus for each $i \ge 0$ the full semantics of a loop is defined in $\mu_i$.

Example 8.6 Consider the versions given by Figs. 2 and 8. The data structures
are very similar; essentially the identity coupling can be used. (It is not literally
the identity, because Node and Node1 are distinct classes and thus the
sets $[\![Node]\!]$ and $[\![Node1]\!]$ have no non-nil location in common. But that is just the
reflection of a coding trick in our formalization of semantics.) The bodies of add
and notifyAll in the two versions have significant differences in the calling graph,
and in particular notifyAll in one version uses a loop whereas in the other it calls
a recursive method in Node1. To reason about these would require proving a loop
invariant and verifying specifications for methods add and notifyAll in Node1. But
for this one wants the final semantics of the program, not the approximate one
given by $\mu_i$ and $\mu'_i$. For given $i$, the semantics of notifyAll is only defined up to
recursion depth $i$; for a list longer than that, the loop in Fig. 2 works correctly but
the recursion in Fig. 8 aborts.

By contrast, equivalence between Figs. 4 and 8 can be shown by an argument
similar to that in Example 8.4. They do not have the same method call graph, but
the called methods are not recursive so one can argue by cases for $i = 0$ and $i > 0$.

If the loop in Fig. 2 is treated as syntactic sugar for a method call then the
equivalence has a complicated proof in terms of corresponding unfoldings of the
semantic approximations. But this is an accidental feature of the example. □
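
The mismatch described in Example 8.6 can be made concrete with a small Java sketch of approximants. The modelling choices are assumptions made for illustration: an empty Optional plays the role of $\bot$, notification is replaced by counting nodes, and recNotify(i, n) stands for the $i$-th approximant of a recursive notifyAll applied to node n. The names ApproximantSketch, listOfLength, loopNotify and recNotify are hypothetical.

```java
import java.util.Optional;

final class ApproximantSketch {
    static final class Node { Node nxt; }

    /** Builds a singly linked list with the given number of nodes (null if zero). */
    static Node listOfLength(int length) {
        Node head = null;
        for (int k = 0; k < length; k++) {
            Node n = new Node();
            n.nxt = head;
            head = n;
        }
        return head;
    }

    /** Loop version: fully defined in every approximant, whatever the list length. */
    static Optional<Integer> loopNotify(Node fst) {
        int visited = 0;
        for (Node n = fst; n != null; n = n.nxt) visited++;
        return Optional.of(visited);
    }

    /**
     * i-th approximant of the recursive version, for a non-null node n.
     * Approximant 0 aborts; approximant i interprets the nested call with approximant i - 1.
     */
    static Optional<Integer> recNotify(int i, Node n) {
        if (i == 0) return Optional.empty();
        if (n.nxt == null) return Optional.of(1);
        return recNotify(i - 1, n.nxt).map(count -> count + 1);
    }
}
```

For a list of three nodes, the loop visits all three in any approximant, whereas the recursive version is defined only from the third approximant on, so at a fixed $i$ the two method bodies are not related even though their final semantics agree.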

Example 8.6 might lead one to wonder whether there is a flaw in the definition
of simulation. Instead of requiring that owner methods preserve the relation given
any approximating and related environments $\mu_i$, $\mu'_i$, perhaps it should be enough
to consider the final semantics $[\![CT]\!]$, $[\![CT']\!]'$. But this is not a sufficiently strong
induction hypothesis to prove the abstraction theorem. In fact the example reflects
a limitation in most theories of simulation and logical relations: what can be shown
equivalent are programs with the same structure in some sense; see Sect. 12.

Example 8.7 Equivalence between the versions given by Figs. 8 and 9 can be
shown by an argument similar to that in Example 8.4. The basic coupling is
like that of Example 7.7 with minor changes: Rep, Rep' are named Node1, Node3
and the sentinel is at location $\ell'_0 \in locs(Node3)$ whereas the locations $\ell'_1, \ell'_2, \ldots$
following it are in $locs(Node0)$. The method call graphs are not identical for the
two versions and dynamic dispatch is used in the second version for Node3.notif.
But the differences involve non-recursive methods and it suffices, as in Example 8.4,
to consider two cases for $\mu_i$, $\mu'_i$, namely $i = 0$ and $i > 0$. □

9. OWNER SUBCLASSING: THE PROTECTED INTERFACE 

This section considers examples involving subclasses of the owner class. Rather 
than formalizing the "protected" construct of Java, we address the issues using a 
module construct. We augment the syntax to designate certain methods as having 
module scope, meaning that they cannot be called by clients. The confinement
conditions for these methods are relaxed.

9.1 Owner subclassing and module scope

The code for notifyAll in Observable of Fig. 2 uses a loop. Here is an equivalent
version using a tail recursive helper method doNotif.

unit notifyAll(){ doNotif(self.fst) }
unit doNotif(Node n){
  if n ≠ null then n.getOb().notify(); doNotif(n.getNext()) else skip fi }

In a language with nested method declarations, doNotif could be declared within
notifyAll. Absent that, it could be given private scope, allowing its calls only in
Observable. But the language of Sects. 4-8 has only public methods. To apply our
abstraction theorem to the desugared version we would have to include a suitable
implementation of doNotif in every version. This can be done for the examples in
this paper, but it is awkward.

In Sect. 9.3 we add module-scoped methods to the language. These suffice for 
desugaring loops and for interactions between reps and owners. In the sequel we 
focus on their use in subclasses of Own. 

Fig. 10 is a variation on the observer pattern in which class Observable has sub- 
class Observable Acc. For accounting purposes it keeps track of the number of times 
each observer has b(x;n notified. To this end, the rep class NodeAcc overrides 
method notifyAII of the client class Node4. Such examples have led to our treat- 
ment of owner subclasses: They are distinguished from clients in that their methods 
may manipulate reps, but unlike Own they cannot store reps in fields. 

Method addn has been added to Observable, so that ObservableAcc can construct 
reps of the subtype NodeAcc and install them, although fst is a private field not visible 
in ObservableAcc. Method Observable.getFirst is also added for this purpose. But 
getFirst leaks a rep; it cannot be allowed in the public interface. One possibility 
is to treat getFirst and addn as visible only in subclasses of Observable. Instead, 
we give them module scope, meaning that calls to getFirst and addn are allowed in 
subclasses of both Observable and Node4. 

Method add in class ObservableAcc constructs a rep, violating the condition "no 
reps in sub-owners" in Assumption 7.15. That assumption is needed for the first 
abstraction theorem because methods of an owner subclass are like clients in that they 
must preserve the induced relation. That means in particular that they manipulate 
related, i.e., equal, rep locations. (By contrast, methods of Own preserve the 
basic coupling which need not impose a correspondence on rep locations.) But if we 
compare two versions, one with sentinel node and one without, the parametricity 
condition for fresh will not apply and the new objects in ObservableAcc.add will 
be at different locations. The solution, given in Sect. 10, is to relax equality to 
bijection. 

This relaxation is needed anyway, to avoid unobservable distinctions. As an 
example, suppose we add to class Observable in Fig. 2 the following method: 

String version(){ result := new String("vsn 0") } 

Consider an alternative that is identical in every way except for the following: 

String version(){ result := new String( "trash" ); result := new String("vsn 0") } 

ACM Journal Name, Vol. V, No. N, Month 20YY. 



February 1, 2008 • 59 



class Node4 extends Object { // rep for Observable 
  Observer ob; 
  Node4 nxt; 

  unit setOb(Observer o){ self.ob := o } 
  unit setNext(Node4 n){ self.nxt := n } 
  Observer getOb(){ result := self.ob } 
  Node4 getNext(){ result := self.nxt } 
  Node4 getNextPri(){ result := self.nxt } 

  unit notifyAll(){ self.ob.notify(); if self.nxt ≠ null then self.nxt.notifyAll() else skip fi } } 
class NodeAcc extends Node4 { 
  int notifs; 

  unit notifyAll(){ self.notifs := self.notifs+1; super.notifyAll() } 
  int notifications(Observer o){ result := 0; 
    if o = self.getOb() then result := notifs 
    else if self.getNext() ≠ null then result := ((NodeAcc)(self.getNext())).notifications(o) 
    else skip fi fi } } 

class ObservableSup extends Object { // superclass of owner; "abstract" class 
  unit add(Observer ob){ abort } 
  unit notifyAll(){ abort } } 
class Observable extends ObservableSup { // owner 
  Node4 fst; // first node of list 
  Node4 getFirst(){ result := self.fst } // module scope 
  unit add(Observer ob){ Node4 n := new Node4; self.addn(ob,n) } 
  unit addn(Observer ob, Node4 n){ n.setNext(self.fst); n.setOb(ob); self.fst := n } // module scope 
  unit notifyAll(){ self.fst.notifyAll() } } 
class ObservableAcc extends Observable { 
  unit add(Observer ob){ Node4 n := new NodeAcc; self.addn(ob,n) } 
  int notifications(Observer ob){ result := ((NodeAcc)(self.getFirst())).notifications(ob) } } 

Fig. 10. Version with owner and rep subclasses and super-call. The owner also has a superclass. The 
two versions of getNext in Node4 are needed for later examples. 

According to Def. 7.9, the induced relation for locations of type String is equality. 
But, even if the allocator is parametric, the locations returned by these two methods 
are not equal. (So condition (2a) fails in Def. 7.9 of simulation.) But they cannot 
be distinguished; this claim is justified by the generalized theory of Sect. 10, where 
the induced relation allows an arbitrary bijection between locations of client types 
like String. For this example, the bijection would be extended to relate the returned 
results from the two versions. 
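
The point can be made concrete with a toy heap. The following is a hedged sketch in Python, not the paper's semantics: the class Heap, its counter-based allocator, and the helper names are assumptions chosen for illustration. Both versions use the same deterministic allocator, yet the returned locations differ; a bijection pairing the two results relates them, and the strings stored there agree.

```python
class Heap:
    """Toy heap: locations are integers handed out in increasing order."""
    def __init__(self):
        self.cells = {}
        self.next_loc = 0

    def new_string(self, text):
        loc = self.next_loc
        self.next_loc += 1
        self.cells[loc] = text
        return loc


def version_a(heap):
    # Corresponds to: result := new String("vsn 0")
    return heap.new_string("vsn 0")


def version_b(heap):
    # Corresponds to the alternative that first allocates "trash".
    heap.new_string("trash")
    return heap.new_string("vsn 0")


heap_a, heap_b = Heap(), Heap()
loc_a, loc_b = version_a(heap_a), version_b(heap_b)

# The results are not equal as locations ...
results_equal = (loc_a == loc_b)

# ... but a bijection pairing them relates the two results, and the
# contents stored at the paired locations are the same.
sigma = {loc_a: loc_b}
related = all(heap_a.cells[l] == heap_b.cells[l2] for l, l2 in sigma.items())
```

The unreachable "trash" string is simply left outside the bijection.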

Returning to the example in Fig. 10, the interface between Observable and its 
subclass ObservableAcc is awkwardly designed. An improvement is to use the 
factory pattern [Gamma et al. 1995] so that add itself can be inherited. In Fig. 11, we 
add method makeNode, which should have module scope, and remove addn. 

To illustrate that owners may reference each other, let us add a method 
allNotifications which reports the number of times a given observer has been notified by 
any observable in a group thereof. In the code of Fig. 12, groups are represented by 
cyclic lists. An ObservableAccG is initially in a singleton group; groups grow using 
method joinGroup. 

These examples show subclasses of reps and owners. There is inheritance into 
the owner but not into the rep. Inheritance into reps is disallowed by our definition 
of confined class table, because to handle it requires a more sophisticated analysis 

class Observable extends ObservableSup { 
  Node4 fst; 
  Node4 getFirst(){ result := self.fst } // module scope 
  Node4 makeNode(){ result := new Node4 } // module scope 
  unit add(Observer ob){ Node4 n := makeNode(); n.setNext(self.fst); n.setOb(ob); self.fst := n } 
  unit notifyAll(){ self.fst.notifyAll() } } 
class ObservableAcc extends Observable { 
  Node4 makeNode(){ result := new NodeAcc } // module scope 
  int notifications(Observer ob){ result := ((NodeAcc)(self.getFirst())).notifications(ob) } } 

Fig. 11. Variation on Fig. 10 using factory pattern. Node4 and NodeAcc are as in Fig. 10. 
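
For readers who want to run the factory variation, here is a rough Python transliteration of Fig. 11 together with the node classes of Fig. 10. It is a sketch under stated assumptions: Python has no module scope, so the underscore prefix merely marks getFirst and makeNode as internal; the Observer class with its seen counter and the None guard in notify_all are additions for the sketch, not part of the figures.

```python
class Observer:
    def __init__(self):
        self.seen = 0          # illustrative counter

    def notify(self):
        self.seen += 1


class Node4:
    def __init__(self):
        self.ob = None
        self.nxt = None

    def notify_all(self):
        self.ob.notify()
        if self.nxt is not None:
            self.nxt.notify_all()


class NodeAcc(Node4):
    def __init__(self):
        super().__init__()
        self.notifs = 0

    def notify_all(self):
        self.notifs += 1
        super().notify_all()

    def notifications(self, o):
        if o is self.ob:
            return self.notifs
        if self.nxt is not None:
            return self.nxt.notifications(o)
        return 0


class Observable:
    def __init__(self):
        self._fst = None

    def _get_first(self):      # module scope in the paper
        return self._fst

    def _make_node(self):      # module scope in the paper; the factory method
        return Node4()

    def add(self, ob):         # inherited unchanged by ObservableAcc
        n = self._make_node()
        n.nxt = self._fst
        n.ob = ob
        self._fst = n

    def notify_all(self):
        if self._fst is not None:   # guard added for the sketch
            self._fst.notify_all()


class ObservableAcc(Observable):
    def _make_node(self):
        return NodeAcc()

    def notifications(self, ob):
        return self._get_first().notifications(ob)


acc = ObservableAcc()
first, second = Observer(), Observer()
acc.add(first)
acc.add(second)
acc.notify_all()
acc.notify_all()
```

Because add calls the overridable factory, ObservableAcc obtains NodeAcc reps without redefining add.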

class ObservableAccG extends ObservableAcc { 
  ObservableAccG peer; 
  con{ self.peer := self } 

  unit joinGroup(ObservableAccG o){ // pre: self.peer=self and o.peer is cyclic list of length > 1 
    self.peer := o.peer; o.peer := self } 
  int allNotifications(Observer ob){ 
    result := self.notifications(ob); ObservableAccG p := self.peer; 
    while p ≠ self do result := result + p.notifications(ob); p := p.peer od } } 

Fig. 12. Extension of Fig. 10 or Fig. 11 with grouped owners. 
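
The cyclic-list bookkeeping of Fig. 12 can be exercised in isolation. The sketch below is an assumption-laden simplification in Python: the class Member stands in for ObservableAccG, and a fixed count field replaces the call notifications(ob), so only joinGroup and the traversal in allNotifications are modelled.

```python
class Member:
    """Stand-in for ObservableAccG; `count` replaces notifications(ob)."""
    def __init__(self, count):
        self.count = count
        self.peer = self          # initially a singleton group

    def join_group(self, o):
        # pre: self.peer is self, and o.peer starts a cyclic list
        assert self.peer is self
        self.peer = o.peer
        o.peer = self

    def all_count(self):
        # Walk the cycle once, starting after self and stopping at self.
        result = self.count
        p = self.peer
        while p is not self:
            result += p.count
            p = p.peer
        return result


a, b, c = Member(1), Member(10), Member(100)
b.join_group(a)
c.join_group(a)
```

After the two joins the three members form one cycle, so every member reports the same total.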

class Observable extends ObservableSup { 
  Node4 snt; 
  con{ snt := new Node4 } 

  Node4 getFirst(){ result := self.snt.getNextPri() } // module scope 
  Node4 makeNode(){ result := new Node4 } // module scope 
  unit add(Observer ob){ 
    Node4 n := makeNode(); n.setNext(self.snt.getNextPri()); n.setOb(ob); self.snt.setNext(n) } 
  unit notifyAll(){ self.snt.getNextPri().notifyAll() } } 



Fig. 13. Variation on Fig. 11 using sentinel. 

to prevent leaks via self; a suitable analysis of "anonymous methods" is discussed 
in Sect. 12. Inheritance into owners also needs restriction; we have chosen a simple 
restriction that nonetheless allows the preceding examples. 

Finally, let us consider an alternative version of Fig. 11 to illustrate the 
consequences of allowing the owner class, but not its subclasses, to differ in comparable 
class tables. In Fig. 11 the subclass ObservableAcc manipulates reps, both 
constructing a new NodeAcc and invoking method notifications declared in NodeAcc. 
Although an alternative version of Observable could use an entirely different type 
of nodes internally, it has to provide method getFirst with return type Node4. 
Because clients can manipulate objects of class ObservableAcc, methods of that class 
must preserve the relation and this only holds if methods they invoke preserve the 
relation. So coupling must be preserved not only by public methods of Observable 
but also by those module scope methods that are invoked in ObservableAcc. As 
a simple example, Fig. 13 gives an alternative that uses Node4 and differs from 
Fig. 11 only in using a sentinel node. 

9.2 On behavioral subclassing 

Behavioral subclassing [Liskov and Wing 1994] is very useful for reasoning about 
specific examples. However, as mentioned earlier, it is not required in general for 
representation independence. Client, rep, or owner subclasses may fail to exhibit 
behavioral subclassing. To illustrate the point let us consider two revisions of 
Fig. 10, both of which violate behavioral subclassing. For the first example, we add 
an overriding declaration to NodeAcc: 

Node4 getNext(){ abort } 

This causes NodeAcc to fail to be a behavioral subclass of Node4 by most definitions. 
(It also prevents the intended functioning of the added method NodeAcc.notifications 
and its callers.) Nonetheless, there is still a simulation between Figs. 11 and 13.* 
Making this true is the reason Fig. 13 uses getNextPri instead of getNext. 

The second revision makes malicious use of a type test. We add nothing to 
NodeAcc, but rather revise Node4 as follows: 

unit notifyAll(){ if self is NodeAcc then abort else self.ob.notify() fi; 
  if self.nxt ≠ null then self.nxt.notifyAll() else skip fi } 

Method notifyAll in NodeAcc now fails to behave properly. In some sense, the 
revised Node4 is non-monotonic with respect to subclassing. Again, there is still 
a simulation between Figs. 11 and 13. Method notifyAll aborts for ObservableAcc 
objects in both versions. 

9.3 Formalization of module-scoped methods 

In Sect. 8 we saw the need for methods that are effectively private to Own, for 
desugaring loops, and also for methods in Own that cannot be called by clients but 
can be called in subclasses of Own. There is also a need for methods of owners and 
reps that can be called by each other but not by clients. For simplicity, we address 
these needs with a simple notion: Own, Rep, and their subclasses are considered 
to be inside a module, and methods may be designated as being visible only inside 
the module. 

To avoid belaboring the formalization, we make no change to the concrete syntax. 
We assume that a class table designates the class names Own and Rep and is 
equipped with a predicate $\mathit{mscope}$ with the interpretation that $\mathit{mscope}(m, C)$ means 
that method $m$ of class $C$ has module scope. The following changes are made to the definitions 
of preceding sections. 

(1) For a well formed class table, $\mathit{mscope}$ must satisfy conditions that reflect what in 
practice would be achieved by declaring Rep, Own, and their subclasses inside 
the module. If $\mathit{mscope}(m, C)$ then 
— $C \le \mathit{Own}$ or $C \le \mathit{Rep}$, 
— $\mathit{mtype}(m, B)$ is undefined for $B > \mathit{Own}$ and $B > \mathit{Rep}$, and 
— $B \le C$ implies $\mathit{mscope}(m, B)$. 



* Here we consider a class table comprised of Node4, NodeAcc, and ObservableSup from Fig. 10, 
along with the overriding declaration NodeAcc.getNext and also Observable and ObservableAcc 
from Fig. 11. The alternative class table is the same except for using Observable from Fig. 13. 

(2) The typing rule for method call has an added restriction that module-scoped 
methods are only visible within the module: 

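\[
\frac{\begin{array}{c}
\Gamma \vdash e : D \qquad \mathit{mtype}(m, D) = \bar{T} \to T \\
\Gamma \vdash \bar{e} : \bar{U} \qquad \bar{U} \le \bar{T} \qquad x \ne \mathsf{self} \qquad T \le \Gamma\,x \\
\mathit{mscope}(m, D) \Rightarrow \Gamma\,\mathsf{self} \le \mathit{Own} \lor \Gamma\,\mathsf{self} \le \mathit{Rep}
\end{array}}{\Gamma \vdash x := e.m(\bar{e})}
\]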

(3) For method environments, the confinement condition of Def. 6.5(1) is replaced 
by the following: 

— $C \le \mathit{Own} \land \mathit{mscope}(m, C) \Rightarrow \mathit{conf}\ C\ (h_0, \eta) \land h \unlhd h_0 \land (d \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Rh_j))$ 
for some confining partition and $j$ with $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$ 
— $C \not\le \mathit{Rep} \land (C \not\le \mathit{Own} \lor \neg\mathit{mscope}(m, C)) \Rightarrow \mathit{conf}\ C\ (h_0, \eta) \land h \unlhd h_0 \land d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ 

(4) For confinement of class tables, the restriction of Def. 6.9(3) is only applied to 
methods with $\neg\mathit{mscope}(m, C)$. 

(5) For simulation, Def. 10.10 in the sequel revises Def. 7.9(2) to require 
preservation of the relation only for public methods, that is, if $\neg\mathit{mscope}(m, \mathit{Own})$. But 
those module-scoped methods that are called in sub-owners must also preserve 
the relation. 

To formalize this, we define $\mathit{prot}(m, C)$ just if $C \le \mathit{Own}$, $\mathit{mscope}(m, \mathit{Own})$, and 
there is a call to $m$ in some subclass of Own. 

(6) Comparable class tables must agree on the public and protected methods of 
Own. Def. 7.1(1) is extended to require that $\mathit{mscope}(m, C) = \mathit{mscope}'(m, C)$ 
for all $C \ne \mathit{Own}$. Moreover, if $\mathit{mtype}(m, \mathit{Own})$ is defined then the following 
hold (and mutatis mutandis for $\mathit{mtype}'$): 

— $\neg\mathit{mscope}(m, \mathit{Own})$ implies $\mathit{mtype}'(m, \mathit{Own}) = \mathit{mtype}(m, \mathit{Own})$ and $\neg\mathit{mscope}'(m, \mathit{Own})$, 
and 
— $\mathit{prot}(m, \mathit{Own})$ implies $\mathit{mtype}'(m, \mathit{Own}) = \mathit{mtype}(m, \mathit{Own})$ and $\mathit{mscope}'(m, \mathit{Own})$ 
(which in turn implies $\mathit{prot}'(m, \mathit{Own})$). 

Example 9.1 Method doNotif in Sect. 9.1 can be given module scope. It would not 
be called in owner subclasses, so it is not required to be present in a comparable class 
table. Method getFirst of Observable in Fig. 10 is called in subclass ObservableAcc, 
so prot(getFirst, Observable) holds and getFirst must be present in a comparable 
class table (and be simulated). □ 
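
The well-formedness conditions of item (1) and the predicate prot of item (5) are simple enough to check mechanically. The following Python sketch does so for the classes of Fig. 10. It is an illustration under assumptions: the dictionary encoding of the hierarchy, the sets mscope, mtype_defined and calls_in, and treating "subclass of Own" in the definition of prot as a proper subclass (as Example 9.1 suggests) are choices made here, not part of the paper.

```python
# Hypothetical encoding of the class hierarchy of Fig. 10: child -> parent.
PARENT = {
    "ObservableSup": "Object", "Observable": "ObservableSup",
    "ObservableAcc": "Observable", "Node4": "Object", "NodeAcc": "Node4",
    "Observer": "Object",
}
OWN, REP = "Observable", "Node4"


def leq(c, d):
    """Reflexive, transitive subclass relation c <= d."""
    while c != d and c in PARENT:
        c = PARENT[c]
    return c == d


def classes():
    return set(PARENT) | {"Object"}


def well_formed(mscope, mtype_defined):
    """Check the three conditions of item (1) for every (m, C) in mscope."""
    for (m, c) in mscope:
        if not (leq(c, OWN) or leq(c, REP)):
            return False
        for b in classes():
            above = (b != OWN and leq(OWN, b)) or (b != REP and leq(REP, b))
            if above and (m, b) in mtype_defined:
                return False
            if leq(b, c) and (m, b) not in mscope:
                return False
    return True


def prot(m, c, mscope, calls_in):
    """prot(m, C): C <= Own, mscope(m, Own), and a proper sub-owner calls m."""
    called_below = any(
        k != OWN and leq(k, OWN) and m in ms for k, ms in calls_in.items()
    )
    return leq(c, OWN) and (m, OWN) in mscope and called_below


# Module-scoped methods, closed under subclassing as item (1) requires.
mscope = {
    ("getFirst", "Observable"), ("getFirst", "ObservableAcc"),
    ("doNotif", "Observable"), ("doNotif", "ObservableAcc"),
}
mtype_defined = set(mscope)
# Which methods are called in the code of which class (illustrative data).
calls_in = {"Observable": {"doNotif"}, "ObservableAcc": {"getFirst"}}
```

With this data, getFirst is protected in the sense of item (5) while doNotif is not, matching Example 9.1.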

Results of Sections 5 and 6 hold for the extended language; the only proof affected 
by the changes is that of Theorem 6.17 which says that $[\![CT]\!]$ is confined if $CT$ is 
confined. The result holds for the revised definitions; the necessary revisions for 
the proof are as follows: 

— In the base case of the induction on depth, the argument proving confinement 
of $\mu_{i+1}\,C\,m$ for the result value $d$ goes by cases on $C$. The argument for the 
case $C \le \mathit{Own}$ still holds for $m$ with $\neg\mathit{mscope}(m, C)$. For the case $C \le \mathit{Own}$ 
and $\mathit{mscope}(m, C)$, the revised definition requires the result value $d$ to satisfy 
$d \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Rh_j)$ for some confining partition and $j$ with $\eta\,\mathsf{self} \in 
\mathit{dom}(Oh_j)$. This follows by definition from $\mathit{conf}\ C\ (h_0, \eta_0)$. 

— In the step of the induction on depth, there is case analysis on $C$ and $B$, proving 
claim $\mathit{conf}\ B\ (h, \eta)$ and confinement of the result value $d$. For the case $C \le 
\mathit{Own} < B$, the argument still holds, noting that $\neg\mathit{mscope}(m, C)$ because in a well 
formed class table module-scoped methods do not occur outside owner and rep 
classes. For the cases $C \le B \le \mathit{Own}$ and $C \le B \le \mathit{Rep}$, the arguments still hold, 
noting that the restrictions on $\mathit{mscope}$ ensure $\mathit{mscope}(m, B) = \mathit{mscope}(m, C)$ so 
the relevant conditions are the same. 

10. SECOND ABSTRACTION THEOREM 

This section improves the first abstraction theorem in two ways. First, the result 
applies to the language extended with modules (see Sect. 9.3). The module-scoped 
methods of the two versions of Own can be different unless they are used in 
subclasses of Own. The second improvement is that parametricity of the allocator is 
no longer required (cf. Sect. 7.3). To compare behaviors of two versions of a 
program we use a bijection between locations rather than equality. This can be seen as 
expressing that the language is parametric in locations, which would fail if the 
language had pointer arithmetic. As discussed in Sect. 9.1, allowing bijection handles 
the problem with new reps in sub-owners that necessitates Assumption 7.15. 
Moreover, it allows coarsening of the notion of equivalence for commands and method 
meanings so that, for example, the bodies of the two versions of method version in 
Sect. 9.1 are equivalent. 

These extensions are enough to treat all the examples in Sect. 9.1 in addition to 
those of Sect. 8 (except Example 8.6, for reasons discussed there). 

Definition 10.1 (typed bijection) A typed bijection is a finite bijective function 
$\sigma$ from $\mathit{Locs}$ to $\mathit{Locs}$ such that $\sigma\,\ell = \ell'$ implies $\mathit{loctype}\,\ell = \mathit{loctype}\,\ell'$. □ 

Throughout the section we let $\sigma$ range over typed bijections and sometimes omit 
the word "typed". To express how bijections cut down to bijections on blocks of 
partitions, we use the notation $\sigma(X)$ for the direct image of $X$ through $\sigma$. 
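
As a concrete reading of Definition 10.1, a typed bijection can be modelled as a finite dictionary. The Python sketch below is illustrative only: representing locations as (class name, index) pairs, the function names, and letting locations outside the domain contribute nothing to the direct image are assumptions of the sketch.

```python
def is_typed_bijection(sigma, loctype):
    """sigma: dict from locations to locations; loctype: location -> class name.

    A dict is a finite function, so it is a bijection onto its range
    exactly when it is injective.
    """
    injective = len(set(sigma.values())) == len(sigma)
    typed = all(loctype(l) == loctype(l2) for l, l2 in sigma.items())
    return injective and typed


def image(sigma, xs):
    """Direct image sigma(X); locations outside dom(sigma) contribute nothing."""
    return {sigma[l] for l in xs if l in sigma}


def loctype(loc):
    # Illustrative locations are (class name, index) pairs.
    return loc[0]


sigma = {("String", 0): ("String", 1), ("Observer", 0): ("Observer", 0)}
ill_typed = {("String", 0): ("Observer", 0)}
not_injective = {("String", 0): ("String", 1), ("String", 2): ("String", 1)}
```

Only the first dictionary qualifies: the second relates locations of different types and the third maps two locations to the same one.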

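Definition 10.2 (basic coupling) Given comparable class tables, a basic 
coupling is a function $G$ that assigns to each typed bijection $\sigma$ a binary relation $G\,\sigma$ on 
heaps (not necessarily closed heaps) that satisfies the following. For any $\sigma, h, h'$, if 
$G\,\sigma\,h\,h'$ then there are partitions $h = Oh * Rh$ and $h' = Oh' * Rh'$ and locations $\ell$ 
and $\ell'$ in $\mathit{locs}(\mathit{Own}{\downarrow})$ such that 

(1) $\sigma\,\ell = \ell'$ and $\{\ell\} = \mathit{dom}\,Oh$ and $\{\ell'\} = \mathit{dom}\,Oh'$ 

(2) $\mathit{dom}(Rh) \subseteq \mathit{locs}(\mathit{Rep}{\downarrow})$ and $\mathit{dom}(Rh') \subseteq \mathit{locs}(\mathit{Rep}'{\downarrow})$ 

(3) $\mathcal{G}\,\sigma\,(\mathit{type}(f, \mathit{loctype}\,\ell))\,(h\,\ell\,f)\,(h'\,\ell'\,f)$ for all $(f : T) \in \mathit{dom}(\mathit{fields}(\mathit{loctype}\,\ell))$ with 
$f \notin \bar{g} = \mathit{dom}(\mathit{dfields}(\mathit{Own}))$ and $f \notin \bar{g}' = \mathit{dom}(\mathit{dfields}'(\mathit{Own}))$. □ 

Item (3) uses the induced coupling $\mathcal{G}$ defined below; it is a harmless forward 
reference because the definition of $\mathcal{G}$ for data types does not depend on $\mathcal{G}$ (or $G$) for 
heaps. Note that we do not require $\mathit{dom}\,\sigma$ to include the reps, nor do we disallow 
that it includes some of them. 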

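Definition 10.3 (coupling relation, $\mathcal{G}$) In the context of a basic coupling with 
given relation $G$, and for each typed bijection $\sigma$, relations $\mathcal{G}\,\sigma\,\theta \subseteq [\![\theta]\!] \times [\![\theta]\!]'$ are defined as 
follows. (Note that in the case of method meanings and method environments 
there is no dependence on $\sigma$.) 

For heaps $h, h'$, we define $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$ iff there exist confining partitions of 
$h, h'$, with the same number $n$ of owner islands, such that 

— $\mathit{dom}\,\sigma \subseteq \mathit{dom}\,h$ and $\mathit{rng}\,\sigma \subseteq \mathit{dom}\,h'$ 

— $G\,\sigma\,(Oh_i * Rh_i)\,(Oh'_i * Rh'_i)$ for all $i$ in $1..n$ 

— $\sigma(\mathit{dom}(Ch)) = \mathit{dom}(Ch')$, i.e., $\sigma$ restricts to a bijection between $\mathit{dom}(Ch)$ and 
$\mathit{dom}(Ch')$ 

— $\mathcal{G}\,\sigma\,(\mathit{state}\,(\mathit{loctype}\,\ell))\,(h\,\ell)\,(h'\,\ell')$ for all $\ell, \ell'$ with $\ell \in \mathit{dom}(Ch)$ and $\sigma\,\ell = \ell'$ 

For other categories $\theta$ we define $\mathcal{G}\,\sigma\,\theta$ as follows. 

\[
\begin{array}{lcl}
\mathcal{G}\,\sigma\,\mathsf{bool}\;d\;d' &\iff& d = d' \\
\mathcal{G}\,\sigma\,\mathsf{unit}\;d\;d' &\iff& d = d' \\
\mathcal{G}\,\sigma\,C\;d\;d' &\iff& \sigma\,d = d' \lor d = \mathit{nil} = d' \\
\mathcal{G}\,\sigma\,\Gamma\;\eta\;\eta' &\iff& \forall x \in \mathit{dom}\,\Gamma \bullet \mathcal{G}\,\sigma\,(\Gamma\,x)\,(\eta\,x)\,(\eta'\,x) \\
\mathcal{G}\,\sigma\,(\mathit{state}\,C)\;s\;s' &\iff& C \not\le \mathit{Own} \land \forall f \in \mathit{dom}(\mathit{fields}\,C) \bullet \mathcal{G}\,\sigma\,(\mathit{type}(f, C))\,(s\,f)\,(s'\,f) \\
\mathcal{G}\,\sigma\,(\theta_\bot)\;\alpha\;\alpha' &\iff& (\alpha = \bot = \alpha') \lor (\alpha \ne \bot \ne \alpha' \land \mathcal{G}\,\sigma\,\theta\,\alpha\,\alpha') \\
\mathcal{G}\,\sigma\,(\mathit{Heap} \otimes \Gamma)\;(h, \eta)\;(h', \eta') &\iff& \mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h' \land \mathcal{G}\,\sigma\,\Gamma\,\eta\,\eta' \\
\mathcal{G}\,\sigma\,(\mathit{Heap} \otimes T)\;(h, d)\;(h', d') &\iff& \mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h' \land \mathcal{G}\,\sigma\,T\,d\,d' \\
\mathcal{G}\,(C, \bar{x}, \bar{T} \to T)\;d\;d' &\iff& \forall \sigma,\ (h, \eta) \in [\![\mathit{Heap} \otimes \Gamma]\!],\ (h', \eta') \in [\![\mathit{Heap} \otimes \Gamma]\!]' \bullet \\
 & & \quad \mathcal{G}\,\sigma\,(\mathit{Heap} \otimes \Gamma)\,(h, \eta)\,(h', \eta') \land \mathit{conf}\ C\ (h, \eta) \land \mathit{conf}\ C\ (h', \eta') \\
 & & \quad \Rightarrow \exists \sigma_0 \supseteq \sigma \bullet \mathcal{G}\,\sigma_0\,(\mathit{Heap} \otimes T)_\bot\,(d(h, \eta))\,(d'(h', \eta')) \\
 & & \quad \text{where } \Gamma = [\bar{x} \mapsto \bar{T}, \mathsf{self} \mapsto C] \\
\mathcal{G}\,\mathit{MEnv}\;\mu\;\mu' &\iff& \forall C, m \bullet (\neg\mathit{mscope}(m, C) \lor \mathit{prot}(m, C)) \land (C \text{ is non-rep}) \land (\mathit{mtype}(m, C) \text{ is defined}) \\
 & & \quad \Rightarrow \mathcal{G}\,(C, \mathit{pars}(m, C), \mathit{mtype}(m, C))\,(\mu\,C\,m)\,(\mu'\,C\,m)
\end{array}
\]
□ 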

(Recall that prot is defined in (5) of Sect. 9.3.) 

As an example, the body of makeNode in ObservableAcc (Fig. 11) returns a new 
rep. Consider a coupling with a version using a sentinel. Given a bijection $\sigma$ 
and related heaps $h, h'$, the location $\ell = \mathit{fresh}(\mathit{Node4}, h)$ may be different from 
$\ell' = \mathit{fresh}(\mathit{Node4}, h')$ even if $\mathit{fresh}$ is parametric, because $h'$ has extra reps, the 
sentinels. But $\sigma$ can be extended with the pair $(\ell, \ell')$. 
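
The extension step can be sketched in a few lines of Python. This is a toy model, not the paper's semantics: the allocator fresh (least unused index per class), the dictionary heaps, and the helper extend are all assumptions introduced for illustration.

```python
import itertools


def fresh(cls, heap):
    """Toy allocator: least unused index for the class (an assumption)."""
    for i in itertools.count():
        if (cls, i) not in heap:
            return (cls, i)


def extend(sigma, l, l2, heap, heap2):
    """Extend sigma with the pair (l, l2); both locations must be fresh."""
    # dom(sigma) and rng(sigma) lie inside the respective heaps ...
    assert set(sigma) <= set(heap) and set(sigma.values()) <= set(heap2)
    # ... so fresh locations cannot clash with existing pairs.
    assert l not in heap and l2 not in heap2
    assert l[0] == l2[0]               # typed: same class on both sides
    return {**sigma, l: l2}


# Version without sentinel: one node. Version with sentinel: an extra rep.
heap = {("Node4", 0): "node for ob1"}
heap2 = {("Node4", 0): "sentinel", ("Node4", 1): "node for ob1"}
sigma = {}                              # reps need not be in dom(sigma)

l, l2 = fresh("Node4", heap), fresh("Node4", heap2)
sigma1 = extend(sigma, l, l2, heap, heap2)
```

The same allocator yields different locations in the two heaps, and the extended dictionary is still injective because both locations are fresh.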

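The following facts are straightforward consequences of the definition. The first 
says that if $h$ and $h'$ are related by $\mathcal{G}$ at $\sigma$, then $\sigma$ is a bijection between the domains 
of $h$ and $h'$ except for reps. 

Lemma 10.4 For all $\sigma, h, h'$ and all $\ell, \ell'$ not in $\mathit{locs}(\mathit{Rep}{\downarrow}, \mathit{Rep}'{\downarrow})$, if $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$ 
then $\sigma((\mathit{dom}\,h) \setminus \mathit{locs}(\mathit{Rep}{\downarrow}, \mathit{Rep}'{\downarrow})) = (\mathit{dom}\,h') \setminus \mathit{locs}(\mathit{Rep}{\downarrow}, \mathit{Rep}'{\downarrow})$. □ 

Lemma 10.5 If $U \le T$ and $\mathcal{G}\,\sigma\,U\,d\,d'$ then $\mathcal{G}\,\sigma\,T\,d\,d'$. □ 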

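For equivalence of values and states, we define a family of relations indexed on 
categories $\theta$. To streamline the notation, we say "$x \sim_\sigma x'$ in $[\![\theta]\!]$" here, and simply 
use the symbol $\sim_\sigma$ later. 

Definition 10.6 (value equivalence) For any $\sigma$, we define a relation $\sim_\sigma$ for data 
values, object states, heaps, and stores, as follows. 

\[
\begin{array}{llcl}
\ell \sim_\sigma \ell' & \text{in } [\![C]\!] &\iff& \sigma\,\ell = \ell' \lor \ell = \mathit{nil} = \ell' \\
d \sim_\sigma d' & \text{in } [\![T]\!] &\iff& d = d' \text{ for primitive types } T \\
s \sim_\sigma s' & \text{in } [\![\mathit{state}\ C]\!] &\iff& \forall f \in \mathit{fields}\,C \bullet s\,f \sim_\sigma s'\,f \\
\eta \sim_\sigma \eta' & \text{in } [\![\Gamma]\!] &\iff& \forall x \in \mathit{dom}\,\Gamma \bullet \eta\,x \sim_\sigma \eta'\,x \\
h \sim_\sigma h' & \text{in } [\![\mathit{Heap}]\!] &\iff& \dots \\
(h, \eta) \sim_\sigma (h', \eta') & \text{in } [\![\mathit{Heap} \otimes \Gamma]\!] &\iff& h \sim_\sigma h' \land \eta \sim_\sigma \eta' \\
d \sim_\sigma d' & \text{in } [\![\theta_\bot]\!] &\iff& d = \bot = d' \lor (d \ne \bot \ne d' \land d \sim_\sigma d' \text{ in } [\![\theta]\!])
\end{array}
\]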

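Lemma 10.7 (identity extension) Suppose $\mathcal{G}\,\sigma\,(\mathit{Heap} \otimes \Gamma)\,(h, \eta)\,(h', \eta')$ and 
$\Gamma\,\mathsf{self}$ is non-rep. Let $(h, \eta)$ and $(h', \eta')$ be confined at $\Gamma\,\mathsf{self}$. If both $\mathit{collect}(\eta, h)$ 
and $\mathit{collect}(\eta', h')$ are Own-free then $\mathit{collect}(\eta, h) \sim_\sigma \mathit{collect}(\eta', h')$. □ 

The reader may care to check that in the case that $\sigma$ is equality, the relations 
$\mathcal{G}\,\sigma\,\theta$ coincide with $\mathcal{R}\,\theta$ and $\sim_\sigma$ is just equality. 

Definition 10.8 (client program equivalence) Suppose programs $CT, (\Gamma \vdash S)$ 
and $CT', (\Gamma \vdash' S')$ are such that $CT, CT'$ are comparable and confined, and 
moreover $S$ (resp. $S'$) occurs in $CT$ (resp. $CT'$). The programs are equivalent iff for all 
confined, Own-free $(h, \eta)$ and $(h', \eta')$ in $[\![\mathit{Heap} \otimes \Gamma]\!]$ and all $\sigma$ with $(h, \eta) \sim_\sigma (h', \eta')$, 
there is some $\sigma_0 \supseteq \sigma$ with 

\[
\mathit{collect}([\![\Gamma \vdash S]\!]\,\mu\,(h, \eta)) \sim_{\sigma_0} \mathit{collect}([\![\Gamma \vdash' S']\!]'\,\mu'\,(h', \eta')), 
\]

where $\mu = [\![CT]\!]$ and $\mu' = [\![CT']\!]'$. □ 

Lemma 10.9 Suppose $C$ and all class names in $\bar{T}$ are non-rep, and $B \le C$. If 
$\mathcal{G}\,(C, \bar{x}, \bar{T} \to T)\,d\,d'$ then $\mathcal{G}\,(B, \bar{x}, \bar{T} \to T)\,(\mathit{restr}(d, B))\,(\mathit{restr}(d', B))$ where $\mathit{restr}$ is 
the restriction to global states of $B$ (see Def. 5.5). □ 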

As discussed in Sect. 9, the relation must be preserved not only by public methods 
but also by any module scope methods that are called by methods declared in 
subclasses of Own. 

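Definition 10.10 (simulation) A simulation is a coupling relation $\mathcal{G}$ such that 

(1) (constructors of Own establish $\mathcal{G}$) For any $\mu, \mu'$, any $\ell, \ell'$ in $\mathit{locs}(\mathit{Own}{\downarrow})$ with 
$\sigma\,\ell = \ell'$, and any $h, h'$ with $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$, let 

\[
\begin{aligned}
h_1 &= [h \mid \ell \mapsto [\mathit{fields}(\mathit{loctype}\,\ell) \mapsto \mathit{defaults}]] \\
h'_1 &= [h' \mid \ell' \mapsto [\mathit{fields}'(\mathit{loctype}\,\ell') \mapsto \mathit{defaults}]] \\
h_0 &= [\![\mathsf{self} : (\mathit{loctype}\,\ell) \vdash \mathit{constr}(\mathit{loctype}\,\ell) : \mathsf{con}]\!]\,\mu\,(h_1, [\mathsf{self} \mapsto \ell]) \\
h'_0 &= [\![\mathsf{self} : (\mathit{loctype}\,\ell') \vdash' \mathit{constr}(\mathit{loctype}\,\ell') : \mathsf{con}]\!]'\,\mu'\,(h'_1, [\mathsf{self} \mapsto \ell'])
\end{aligned}
\]

Then there is $\sigma_0 \supseteq \sigma$ such that $G\,\sigma_0\,h_0\,h'_0$. 

(2) (methods of Own preserve $\mathcal{G}$) Let $\mu \in \mathbb{N} \to [\![\mathit{MEnv}]\!]$ (resp. $\mu' \in \mathbb{N} \to [\![\mathit{MEnv}]\!]'$) 
be the approximation chain in the definition of $[\![CT]\!]$ (resp. $[\![CT']\!]'$). For every 
$m$ with $\mathit{mtype}(m, \mathit{Own})$ defined and $\neg\mathit{mscope}(m, \mathit{Own})$ or $\mathit{prot}(m, \mathit{Own})$, the 
following implications hold for every $i$, where $\bar{x} = \mathit{pars}(m, \mathit{Own})$ and $\bar{T} \to T = 
\mathit{mtype}(m, \mathit{Own})$. 

(a) $\mathcal{G}\,\mathit{MEnv}\,\mu_i\,\mu'_i \Rightarrow \mathcal{G}\,(\mathit{Own}, \bar{x}, \bar{T} \to T)\,([\![M]\!]\mu_i)\,([\![M']\!]'\mu'_i)$ 
if $m$ has declaration $M$ in $CT(\mathit{Own})$ and $M'$ in $CT'(\mathit{Own})$ 

(b) $\mathcal{G}\,\mathit{MEnv}\,\mu_i\,\mu'_i \Rightarrow \mathcal{G}\,(\mathit{Own}, \bar{x}, \bar{T} \to T)\,([\![M]\!]\mu_i)\,(\mathit{restr}([\![M_B]\!]'\mu'_i, \mathit{Own}))$ 
if $m$ has declaration $M$ in $CT(\mathit{Own})$ and is inherited from $B$ in $CT'(\mathit{Own})$, 
with $M_B$ the declaration of $m$ in $B$ 

(c) $\mathcal{G}\,\mathit{MEnv}\,\mu_i\,\mu'_i \Rightarrow \mathcal{G}\,(\mathit{Own}, \bar{x}, \bar{T} \to T)\,(\mathit{restr}([\![M_B]\!]\mu_i, \mathit{Own}))\,([\![M']\!]'\mu'_i)$ 
if $m$ has declaration $M'$ in $CT'(\mathit{Own})$ and is inherited from $B$ in $CT(\mathit{Own})$, 
with $M_B$ the declaration of $m$ in $B$ 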

Instead of Assumption 7.15 we need only the following. 

Assumption 10.11 $CT$ and $CT'$ are confined class tables for which a (generalized) 
simulation $\mathcal{G}$ is given. 

Theorem 10.12 (abstraction) $\mathcal{G}\ \mathit{MEnv}\ [\![CT]\!]\ [\![CT']\!]'$. 

The proof is essentially the same as the proof of Theorem 7.20. The definition 
of $\mathcal{G}\ \mathit{MEnv}$ requires the relation to be preserved by those module-scoped methods 
that are called by subowners, and this is ensured by Def. 10.10(2) of simulation. 
The lemmas used in the proof are as follows. 

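Lemma 10.13 (preservation by expressions) For any non-rep class $C \ne \mathit{Own}$ 
and any constituent expression $\Gamma \vdash e : T$ of a method declared in $C$, the 
following holds: For all $\sigma$ and all $(h, \eta) \in [\![\mathit{Heap} \otimes \Gamma]\!]$ and $(h', \eta') \in [\![\mathit{Heap} \otimes \Gamma]\!]'$, if 
$\mathcal{G}\,\sigma\,(\mathit{Heap} \otimes \Gamma)\,(h, \eta)\,(h', \eta')$ then 

\[
\mathcal{G}\,\sigma\,(T_\bot)\,([\![\Gamma \vdash e : T]\!](h, \eta))\,([\![\Gamma \vdash' e : T]\!]'(h', \eta')). 
\]

Proof. The proof is very similar to the proof of Lemma 7.22 except in the case 
of field access. 

For $\Gamma \vdash e.f : T$, the argument is as follows, for any $\sigma$. By induction on $e$ we have 
$\mathcal{G}\,\sigma\,C_\bot\,\ell\,\ell'$. In the non-$\bot$ case, $\ell \ne \mathit{nil} \ne \ell'$ hence, by definition of $\mathcal{G}$, $\sigma\,\ell = \ell'$. By 
closure of the heaps, $\ell \in \mathit{dom}\,h$ and $\ell' \in \mathit{dom}\,h'$. 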

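We consider cases on whether $C \le \mathit{Own}$. Consider confining partitions $(Ch * 
Oh_1 * Rh_1 \ldots) = h$ and $(Ch' * Oh'_1 * Rh'_1 \ldots) = h'$ that have corresponding islands 
as in the definition of $\mathcal{G}\ \mathit{Heap}$. In the case $C \le \mathit{Own}$, we have $\ell \in \mathit{locs}(\mathit{Own}{\downarrow})$ and 
hence $\ell$ in some $\mathit{dom}(Oh_i)$. From $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$ we have 

\[
G\,\sigma\,(Oh_i * Rh_i)\,(Oh'_i * Rh'_i) 
\]

and thus $\ell' \in \mathit{dom}(Oh'_i)$ by basic coupling Def. 10.2(1) and bijectivity of $\sigma$. Since 
$C \ne \mathit{Own}$, we know by visibility that $f$ is not in the private fields $\bar{g}$ of Own. Thus, 
as $\mathit{type}(f, \mathit{loctype}\,\ell) = T$, we have $\mathcal{G}\,\sigma\,T\,(h\,\ell\,f)\,(h'\,\ell'\,f)$ by Def. 10.2(3). 

In the case $C \not\le \mathit{Own}$ we have $\ell \in \mathit{dom}(Ch)$ and hence $\ell' \in \mathit{dom}(Ch')$ by $\sigma\,\ell = \ell'$ 
and definition $\mathcal{G}\ \mathit{Heap}$. Hence 

\[
\mathcal{G}\,\sigma\,(\mathit{state}\,(\mathit{loctype}\,\ell))\,(h\,\ell)\,(h'\,\ell') 
\]

and thus $\mathcal{G}\,\sigma\,T\,(h\,\ell\,f)\,(h'\,\ell'\,f)$ by definition of $\mathcal{G}\,(\mathit{state}\,(\mathit{loctype}\,\ell))$. Note that 
$\mathit{loctype}\,\ell = \mathit{loctype}\,\ell'$ because $\sigma$ is a typed bijection. □ 

Lemma 10.14 (preservation by commands) Suppose that $\mu$ and $\mu'$ are 
confined method environments and $\mathcal{G}\ \mathit{MEnv}\ \mu\ \mu'$. Then the following holds for any 
non-rep class $C \ne \mathit{Own}$. For any constituent command $\Gamma \vdash S$ in a method 
declaration in $CT(C)$, any $\sigma$, and any $(h, \eta) \in [\![\mathit{Heap} \otimes \Gamma]\!]$ and $(h', \eta') \in [\![\mathit{Heap} \otimes \Gamma]\!]'$, if 
$\mathit{conf}\ C\ (h, \eta)$, $\mathit{conf}\ C\ (h', \eta')$, and $\mathcal{G}\,\sigma\,(\mathit{Heap} \otimes \Gamma)\,(h, \eta)\,(h', \eta')$ then there is $\sigma_0 \supseteq \sigma$ 
such that 

\[
\mathcal{G}\,\sigma_0\,(\mathit{Heap} \otimes \Gamma)_\bot\,([\![\Gamma \vdash S]\!]\,\mu\,(h, \eta))\,([\![\Gamma \vdash' S]\!]'\,\mu'\,(h', \eta')). 
\]

Proof. The proof is very similar to the proof of the corresponding Lemma 7.23 
except in the cases of method call, field update, and most interestingly new. We no 
longer have the assumption of parametricity of the allocator, and we must consider 
construction of reps in sub-owners. We also need an analog to Lemma 7.21, saying 
that constructors establish $\mathcal{G}$: 

Claim: For all $\sigma$ and all $(h, \ell) \in [\![\mathit{Heap} \otimes C]\!]$ and $(h', \ell') \in [\![\mathit{Heap} \otimes C]\!]'$, if 
$\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$ and $\mathcal{G}\,\sigma\,C\,\ell\,\ell'$ then there is $\sigma_0 \supseteq \sigma$ such that $\mathcal{G}\,\sigma_0\,\mathit{Heap}\,h_0\,h'_0$ where 

\[
\begin{aligned}
h_0 &= [\![\mathsf{self} : C \vdash \mathit{constr}\,C : \mathsf{con}]\!]\,\mu\,(h, [\mathsf{self} \mapsto \ell]) \\
h'_0 &= [\![\mathsf{self} : C \vdash' \mathit{constr}\,C : \mathsf{con}]\!]'\,\mu'\,(h', [\mathsf{self} \mapsto \ell'])
\end{aligned}
\]

We omit the proof of the claim, which has the same structure as the proof of 
Lemma 7.21. 

Case $\Gamma \vdash x := e.m(\bar{e})$. This goes through as before except for the case where 
$C \le \mathit{Own}$. In that case, the called method may have module scope and this is why 
such methods (designated by $\mathit{prot}$) are included in the definition of $\mathcal{G}\ \mathit{MEnv}$. 

Case $\Gamma \vdash e_1.f := e_2$. By Lemma 10.13 for $e_1$ we have $\mathcal{G}\,\sigma\,C\,\ell\,\ell'$, hence $\sigma\,\ell = \ell'$ 
by definition of $\mathcal{G}$. By Lemma 10.13 for $e_2$ we have $\mathcal{G}\,\sigma\,U\,d\,d'$ and hence $\mathcal{G}\,\sigma\,T\,d\,d'$ 
by Lemma 10.5. To conclude the argument it suffices to show 

\[
\mathcal{G}\,\sigma\,\mathit{Heap}\ [h \mid \ell \mapsto [h\,\ell \mid f \mapsto d]]\ [h' \mid \ell' \mapsto [h'\,\ell' \mid f \mapsto d']]. \qquad (*) 
\]

Consider confining partitions $(Ch * Oh_1 * Rh_1 \ldots) = h$ and $(Ch' * Oh'_1 * Rh'_1 \ldots) = h'$ 
that correspond as in the definition of $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$. We argue by cases on $C$. 

— $C \le \mathit{Own}$: Then $\mathit{loctype}\,\ell \le C \le \mathit{Own}$. By $\sigma\,\ell = \ell'$ and $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$, there is $i$ 
such that $\{\ell\} = \mathit{dom}(Oh_i)$ and $\{\ell'\} = \mathit{dom}(Oh'_i)$ and 

\[
G\,\sigma\,(Oh_i * Rh_i)\,(Oh'_i * Rh'_i). 
\]

By typing and $C \ne \mathit{Own}$, field $f$ is not in the private fields $\bar{g}$ of Own. So $(*)$ 
follows from $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$ and $\mathcal{G}\,\sigma\,T\,d\,d'$. 

— $C \not\le \mathit{Own}$: As $C$ is non-rep, we have $\ell \in \mathit{dom}\,Ch$ and $\ell' \in \mathit{dom}\,Ch'$. Moreover, 
$\mathcal{G}\,\sigma\,(\mathit{state}\,(\mathit{loctype}\,\ell))\,(h\,\ell)\,(h'\,\ell')$ and so by $\mathcal{G}\,\sigma\,T\,d\,d'$ we get 

\[
\mathcal{G}\,\sigma\,(\mathit{state}\,(\mathit{loctype}\,\ell))\ [h\,\ell \mid f \mapsto d]\ [h'\,\ell' \mid f \mapsto d']. 
\]

Hence $(*)$. 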

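Case $\Gamma \vdash x := \mathsf{new}\ B$. By confinement of $CT$, this command is confined and 
hence the final states are confined: $\mathit{conf}\ C\ (h_0, \eta_0)$ and $\mathit{conf}\ C\ (h'_0, \eta'_0)$. We have 
$C \not\le \mathit{Rep}$ and $C \ne \mathit{Own}$. Let $\ell = \mathit{fresh}(B, h)$ and $\ell' = \mathit{fresh}(B, h')$. Define 
$\sigma_1 = \sigma \cup \{(\ell, \ell')\}$. This makes $\sigma_1$ bijective because $\ell, \ell'$ are fresh and $\mathcal{G}\,\sigma\,\mathit{Heap}\,h\,h'$ 
implies, by definition, that $\mathit{dom}\,\sigma \subseteq \mathit{dom}\,h$ and $\mathit{rng}\,\sigma \subseteq \mathit{dom}\,h'$. 

68 • A. Banerjee and D. Naumann 

By $\mathcal{G}\,\sigma\,\Gamma\,\eta\,\eta'$ and definition of $\sigma_1$ we have $\mathcal{G}\,\sigma_1\,\Gamma\,\eta_0\,\eta'_0$. We proceed to show 
$\mathcal{G}\,\sigma_1\,\mathit{Heap}\,h_0\,h'_0$, by cases on $B$. 

— $B \not\le \mathit{Own} \land B \not\le \mathit{Rep}$: We have $\mathit{fields}\,B = \mathit{fields}'\,B$ and thus the initial states 
of the two new objects are related by $\mathcal{G}\,\sigma_1\,(\mathit{state}\,B)$. 
So, as $B$ is non-rep and $B \not\le \mathit{Own}$, we can add $\ell$ to $Ch$ and $\ell'$ to $Ch'$ to get 
partitions that witness $\mathcal{G}\,\sigma_1\,\mathit{Heap}\,h_1\,h'_1$. Now the induction hypothesis yields 
some $\sigma_0 \supseteq \sigma_1$ such that $\mathcal{G}\,\sigma_0\,\mathit{Heap}\,h_0\,h'_0$. We obtain $\mathcal{G}\,\sigma_0\,\Gamma\,\eta_0\,\eta'_0$ from 
$\mathcal{G}\,\sigma_1\,\Gamma\,\eta_0\,\eta'_0$ because $\sigma_0 \supseteq \sigma_1$. 

— $B \le \mathit{Own}$: By basic coupling, Def. 10.2, we get $\sigma_0$ with $G\,\sigma_0\,h_2\,h'_2$. Moreover, 
$h_2$ and $h'_2$ are owner islands and the confining partitions for $h, h'$ extend to ones 
for $h * h_2$ and $h' * h'_2$ with $\sigma_0$. Finally, by definition of $\mathcal{G}$ we get $\mathcal{G}\,\sigma_0\,\mathit{Heap}\,h_0\,h'_0$ 
as $h_0 = h * h_2$ and $h'_0 = h' * h'_2$. 

— $B \le \mathit{Rep}$: Here, $C \le \mathit{Own}$ or $C \le \mathit{Rep}$, as otherwise the command would not be 
confined. Let $j$ be such that $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j * Rh_j)$. Add $\ell$ to $Rh_j$ and $\ell'$ to 
$Rh'_j$. This yields $\mathcal{G}\,\sigma_0\,\mathit{Heap}\,h_0\,h'_0$ with $h_0 = h * h_2$ and $h'_0 = h' * h'_2$. □ 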

11. STATIC ANALYSIS 

This section gives a syntax directed static analysis. It checks a property called 
safety. Safety is shown to imply confinement. 

The input is a well formed class table and designated class names Own and Rep. 
With one exception, only rep and owner code (including subclasses) is constrained. 
The exception is for new: a client cannot construct a new rep. For practical 
application, this can be ensured in a modular way: Rep and its subclasses would 
simply be declared with module scope. 

The analysis is given for the language extended in Sect. 9.3 with module-scoped 
methods. For the original language, $\mathit{mscope}(m, C)$ can be taken to be false for all 
$m$ and $C$. 

Definition 11.1 (safe) Class table $CT$ is safe iff for every $C$ and every $m$ with 
$\mathit{mtype}(m, C) = \bar{T} \to T$ the following hold. 

(1) If $m$ is declared in $C$ by $T\ m(\bar{T}\ \bar{x})\{S\}$ then $\bar{x} : \bar{T}, \mathsf{self} : C, \mathsf{result} : T \rhd S$ where $\rhd$ 
is the safety relation defined in the sequel. 

(2) $\mathsf{self} : C \rhd \mathit{constr}\,C$, for all $C$ 

(3) If $C \le \mathit{Own}$ and $\neg\mathit{mscope}(m, C)$ then $T \not\le \mathit{Rep}$. 

(4) If $m$ is inherited in Own from some $B > \mathit{Own}$ then $T \not\le \mathit{Rep}$. 

(5) No $m$ is inherited in Rep from any $B > \mathit{Rep}$. 

The safety relation $\rhd$ is defined by the following rules. There is no restriction on 
field declarations per se. A client can have a Rep type field, but can assign only 
null to it. 

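Safety for expressions 

\[
\Gamma \rhd x : \Gamma\,x \qquad \Gamma \rhd \mathsf{null} : B \qquad \Gamma \rhd \mathsf{it} : \mathsf{unit} \qquad \Gamma \rhd \mathsf{true} : \mathsf{bool} \qquad \Gamma \rhd \mathsf{false} : \mathsf{bool}
\]

\[
\frac{\begin{array}{c}
C = (\Gamma\,\mathsf{self}) \qquad \Gamma \rhd e : C \qquad (f : T) \in \mathit{dfields}\,C \\
C = \mathit{Own} \land e \ne \mathsf{self} \Rightarrow T \not\le \mathit{Rep} \\
C < \mathit{Own} \Rightarrow T \not\le \mathit{Rep}
\end{array}}{\Gamma \rhd e.f : T}
\]

\[
\frac{\Gamma \rhd e_1 : T \qquad \Gamma \rhd e_2 : T}{\Gamma \rhd e_1 = e_2 : \mathsf{bool}}
\qquad
\frac{\Gamma \rhd e : D \qquad B \le D}{\Gamma \rhd (B)\,e : B}
\qquad
\frac{\Gamma \rhd e : D \qquad B \le D}{\Gamma \rhd e\ \mathsf{is}\ B : \mathsf{bool}}
\]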



For expressions, the analysis imposes restrictions on field accesses and nothing 
else. If e.f appears in the body of an owner method, then a Rep can be accessed 
only via the private fields of Own; this requires e to be self. If e.f appears in 
a sub-owner, then the private fields of Own cannot be accessed, hence the result 
cannot be a Rep. 

For commands, the rules impose restrictions on new, field update, and method 
call. The conditions on field update are analogous to those for field access. For an 
object construction x := new B in the body of a client method, it cannot create a 
new rep. And, if it appears in a subclass of Rep, it cannot create a new owner as 
this would break confinement of the heap. 
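
The restrictions on field access and on new described above can be phrased as two small predicates. The Python sketch below is a simplified reading, not the paper's analysis: the hierarchy dictionary, the class name Client, the function names, and reading "not a rep type" as "not a subclass of Rep" are assumptions made for illustration.

```python
# Hypothetical class hierarchy: child -> parent.
PARENT = {
    "Observable": "Object", "ObservableAcc": "Observable",
    "Node4": "Object", "NodeAcc": "Node4", "Client": "Object",
}
OWN, REP = "Observable", "Node4"


def leq(c, d):
    """Reflexive, transitive subclass relation c <= d."""
    while c != d and c in PARENT:
        c = PARENT[c]
    return c == d


def safe_new(c, b):
    """x := new B appearing in a method of class C."""
    if not leq(c, REP) and not leq(c, OWN) and leq(b, REP):
        return False        # a client cannot construct a rep
    if leq(c, REP) and leq(b, OWN):
        return False        # a rep cannot construct an owner
    return True


def safe_field_access(c, e_is_self, t):
    """e.f of type T appearing in a method of class C."""
    if c == OWN and not e_is_self and leq(t, REP):
        return False        # owner code reaches reps only through self
    if c != OWN and leq(c, OWN) and leq(t, REP):
        return False        # sub-owners cannot read a rep-typed field
    return True
```

For instance, the construction of a NodeAcc in ObservableAcc is accepted, whereas the same command in a client class is rejected.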

For method call $x := e.m(\bar{e})$, the condition labelled (a) says that if $m$ is a client 
method called from a subclass of Own or Rep, then $m$ cannot be passed reps as 
parameters. Conditions (b) and (c) consider method calls from an owner class or 
its subclasses: (b) says that if $m$'s type is comparable to Own then reps can be 
passed as parameters only if $e$ is self. Finally, (c) says that if $m$'s type is comparable 
to Rep then no owner, other than itself, can be passed as parameter; otherwise 
confinement will be violated. 



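Safety for commands 

\[
\frac{\begin{array}{c}
C = (\Gamma\,\mathsf{self}) \qquad x \ne \mathsf{self} \qquad B \le \Gamma\,x \qquad B \ne \mathit{Object} \\
C \not\le \mathit{Rep} \land C \not\le \mathit{Own} \Rightarrow B \not\le \mathit{Rep} \\
C \le \mathit{Rep} \Rightarrow B \not\le \mathit{Own}
\end{array}}{\Gamma \rhd x := \mathsf{new}\ B}
\qquad
\frac{\begin{array}{c}
C = (\Gamma\,\mathsf{self}) \qquad (f : T) \in \mathit{dfields}\,C \\
\Gamma \rhd e_1 : C \qquad \Gamma \rhd e_2 : U \qquad U \le T \\
C = \mathit{Own} \land e_1 \ne \mathsf{self} \Rightarrow U \not\le \mathit{Rep} \\
C < \mathit{Own} \Rightarrow U \not\le \mathit{Rep}
\end{array}}{\Gamma \rhd e_1.f := e_2}
\]

\[
\frac{\begin{array}{c}
\Gamma \rhd e : D \qquad \mathit{mtype}(m, D) = \bar{T} \to T \qquad T \le \Gamma\,x \\
\Gamma \rhd \bar{e} : \bar{U} \qquad \bar{U} \le \bar{T} \qquad x \ne \mathsf{self} \\
C = (\Gamma\,\mathsf{self}) \qquad \mathit{mscope}(m, D) \Rightarrow C \le \mathit{Own} \lor C \le \mathit{Rep} \\
\text{(a)}\ (C \le \mathit{Own} \lor C \le \mathit{Rep}) \land D \not\le \mathit{Rep} \land D \not\le \mathit{Own} \Rightarrow \bar{T} \not\le \mathit{Rep} \\
\text{(b)}\ C \le \mathit{Own} \Rightarrow D \not\le \mathit{Own} \lor (e = \mathsf{self}) \lor \bar{T} \not\le \mathit{Rep} \\
\text{(c)}\ C \le \mathit{Own} \Rightarrow D \not\le \mathit{Rep} \lor (\forall e_i \in \bar{e} \bullet e_i \ne \mathsf{self} \Rightarrow T_i \not\le \mathit{Own})
\end{array}}{\Gamma \rhd x := e.m(\bar{e})}
\]

\[
\frac{\begin{array}{c}
C = (\Gamma\,\mathsf{self}) \qquad \mathit{mtype}(m, \mathit{super}\,C) = \bar{T} \to T \\
\Gamma \rhd \bar{e} : \bar{U} \qquad \bar{U} \le \bar{T} \qquad x \ne \mathsf{self} \qquad T \le \Gamma\,x
\end{array}}{\Gamma \rhd x := \mathsf{super}.m(\bar{e})}
\]

\[
\frac{x \ne \mathsf{self} \qquad \Gamma \rhd e : T \qquad T \le \Gamma\,x}{\Gamma \rhd x := e}
\qquad
\frac{\Gamma \rhd S_1 \qquad \Gamma \rhd S_2}{\Gamma \rhd S_1;\ S_2}
\]

\[
\frac{\Gamma \rhd e : \mathsf{bool} \qquad \Gamma \rhd S_1 \qquad \Gamma \rhd S_2}{\Gamma \rhd \mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}}
\qquad
\frac{\Gamma \rhd e : U \qquad U \le T \qquad (\Gamma, x : T) \rhd S}{\Gamma \rhd T\ x := e\ \mathsf{in}\ S}
\]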

Theorem 11.2 (soundness) If CT is safe then it is confined. 

Proof. Items (3)-(5) in the definition of safety are the same as items (3)-(5) in 
the definition of confinement for class tables. For items (1) and (2), the confinement 
of method and constructor bodies follows from safety thereof, by Lemmas 11.3, 11.5, 
and 11.6 to follow. □ 

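Lemma 11.3 (argument values confined) Suppose $\Gamma \vdash e : D$ and $\Gamma \vdash \bar{e} : \bar{U}$ are 
confined. 

(1) If $\Gamma \rhd x := e.m(\bar{e})$ then $\Gamma \vdash x := e.m(\bar{e})$ has confined arguments. 

(2) If $\Gamma \rhd x := \mathsf{super}.m(\bar{e})$ then $\Gamma \vdash x := \mathsf{super}.m(\bar{e})$ has confined arguments. 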

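Proof. We give the argument for (1); the argument for (2) is similar (see Appendix).

As in Def. 6.8, let $C = (\Gamma\,\mathsf{self})$. Assume $\mathit{conf}\ \mu$ and $\mathit{conf}\ C\ (h,\eta)$. Let
$\ell = [\![\Gamma \vdash e : D]\!](h,\eta)$, let $\bar{d} = [\![\Gamma \vdash \bar{e} : \bar{U}]\!](h,\eta)$, and let
$\eta_1 = [\mathsf{self} \mapsto \ell, \bar{x} \mapsto \bar{d}]$. Finally, let $\ell \neq \mathit{nil}$, $\ell \neq \bot$ and $\bar{d} \neq \bot$.

Because $\Gamma \triangleright x := e.m(\bar{e})$ holds we can use conditions (a)-(c) in the analysis rule
for method call. Now the proof proceeds by cases on $C$ with subcases on $\mathit{loctype}\,\ell$.
In each case we show $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$.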

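- $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: Because $e$ and $\bar{e}$ are confined at $C$, we have $\ell \notin \mathit{locs}(\mathit{Rep}{\downarrow})$
and $d_i \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ for all $d_i \in \bar{d}$. Thus $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$, proving
$\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$ by Def. 6.4(1).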
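- $C \le \mathit{Own}$: Choose a confining partition and let $j$ be such that $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$.
Because $e$ and $\bar{e}$ are confined at $C$, we have $\ell \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow \ell \in \mathit{dom}(Rh_j)$
and $d_i \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d_i \in \mathit{dom}(Rh_j)$ for all $d_i \in \bar{d}$. Now we go by cases on
$\mathit{loctype}\,\ell$:

  - $\mathit{loctype}\,\ell \not\le \mathit{Rep} \land \mathit{loctype}\,\ell \not\le \mathit{Own}$: Because $\mathit{loctype}\,\ell \le D$ we have
  $D \not\le \mathit{Rep} \land D \not\le \mathit{Own}$. Hence by condition (a) of the analysis, $\bar{T} \not\le \mathit{Rep}$. Thus
  $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$, proving $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$ by Def. 6.4(1).

  - $\mathit{loctype}\,\ell \le \mathit{Own}$: Hence $\ell \in \mathit{dom}(Oh_k)$ for some $k$. Because $\mathit{loctype}\,\ell \le D$
  we have $D \le \mathit{Own} \lor \mathit{Own} \le D$. If $e = \mathsf{self}$ then $\ell = (\eta\,\mathsf{self})$ and $k = j$.
  Then $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \bar{d} \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j) = \mathit{dom}(Rh_k)$. Thus
  $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$ by Def. 6.4(2). If $e \neq \mathsf{self}$ then by condition (b) of the
  analysis, $\bar{T} \not\le \mathit{Rep}$. Hence $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$, proving $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$
  by Def. 6.4(2).

  - $\mathit{loctype}\,\ell \le \mathit{Rep}$: Hence $\ell \in \mathit{dom}(Rh_j)$ by confinement of $e$ at $C$. As $\mathit{loctype}\,\ell \le D$
  we have $D \le \mathit{Rep} \lor \mathit{Rep} \le D$. By Def. 6.4(3), to show $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$,
  we must show $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh_j)$. For any
  $d_i \in \mathit{locs}(\mathit{Own}{\downarrow})$, because $\mathit{loctype}\,d_i \le T_i$, we have $T_i \le \mathit{Own} \lor \mathit{Own} \le T_i$. So by
  condition (c) of the analysis, $e_i = \mathsf{self}$, hence $d_i = (\eta\,\mathsf{self}) \in \mathit{dom}(Oh_j)$. For
  any $d_i \in \mathit{locs}(\mathit{Rep}{\downarrow})$ we have $d_i \in \mathit{dom}(Rh_j)$ by confinement of $\bar{e}$ at $C$.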

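- $C \le \mathit{Rep}$: Choose a confining partition and let $j$ be such that $\eta\,\mathsf{self} \in \mathit{dom}(Rh_j)$.
Because $e$ and $\bar{e}$ are confined at $C$, we have $\ell \in \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \Rightarrow \ell \in \mathit{dom}(Oh_j * Rh_j)$
and $d_i \in \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \Rightarrow d_i \in \mathit{dom}(Oh_j * Rh_j)$ for all
$d_i \in \bar{d}$. Now we go by cases on $\mathit{loctype}\,\ell$.

  - $\mathit{loctype}\,\ell \not\le \mathit{Rep} \land \mathit{loctype}\,\ell \not\le \mathit{Own}$: Because $\mathit{loctype}\,\ell \le D$ we have
  $D \not\le \mathit{Rep} \land D \not\le \mathit{Own}$. Hence by condition (a) of the analysis, $\bar{T} \not\le \mathit{Rep}$. Thus
  $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$, proving $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$ by Def. 6.4(1).

  - $\mathit{loctype}\,\ell \le \mathit{Own}$: Hence $\ell \in \mathit{dom}(Oh_j)$. Now $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$
  as required for $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$ by Def. 6.4(2).

  - $\mathit{loctype}\,\ell \le \mathit{Rep}$: Hence $\ell \in \mathit{dom}(Rh_j)$. Now $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \subseteq
  \mathit{dom}(Oh_j * Rh_j)$ as required for $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$, by Def. 6.4(3). □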

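Lemma 11.4 (soundness for expressions) If $\Gamma \triangleright e : T$ then $\Gamma \vdash e : T$ is confined.

Proof. Let $C = (\Gamma\,\mathsf{self})$. Now we go by induction on $\Gamma \triangleright e : T$. Assume
$\mathit{conf}\ C\ (h,\eta)$ and $d = [\![\Gamma \vdash e : T]\!](h,\eta) \neq \bot$ for each case of $e$.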

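Case $\Gamma \triangleright e.f : T$. Then $d = h\,\ell\,f$. We consider cases on $C$.

- $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: We must show $d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$. Because $\mathit{loctype}\,\ell \le C$, we
have $\ell \notin \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow})$. So $\ell$ is in the client part of a confining partition and
by Def. 6.2(1) we have $d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$.

- $C \le \mathit{Own}$: Consider a confining partition and $j$ such that $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$.
We must show $d \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Rh_j)$. Assume $d \in \mathit{locs}(\mathit{Rep}{\downarrow})$. If
$C = \mathit{Own}$, we have two subcases: If $e = \mathsf{self}$ we get $\ell = \eta\,\mathsf{self}$, so $i = j$ and
$d \in \mathit{dom}(Rh_j)$; if $e \neq \mathsf{self}$ then by the analysis we get $T \not\le \mathit{Rep}$ so $d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$,
falsifying the antecedent. This concludes the case $C = \mathit{Own}$. If $C < \mathit{Own}$ then
by the analysis $T \not\le \mathit{Rep}$ so $d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$, falsifying the antecedent.

- $C \le \mathit{Rep}$: Consider a confining partition and $j$ such that $\eta\,\mathsf{self} \in \mathit{dom}(Rh_j)$. We
must show $d \in \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Oh_j * Rh_j)$. Since $\mathit{loctype}\,\ell \le C$, we
have $\ell \in \mathit{locs}(\mathit{Rep}{\downarrow})$ and by induction on $e$ we get $\ell \in \mathit{dom}(Rh_j)$ via Def. 6.6(3).
Because $h$ is confined we get $d \in \mathit{dom}(Oh_j * Rh_j)$ by Def. 6.2(4).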

The remaining cases are similar; see Appendix. □ 

Lemma 11.5 (soundness for constructors) Suppose that $\mathsf{self} : C \triangleright \mathit{constr}\,C$ for
all $C$ and let $\mu$ be arbitrary. Then the constructor semantics is confined in the
following sense: For all $(h,\eta)$ with $\mathit{conf}\ C\ (h,\eta)$ we have $\mathit{conf}\ h_0$ and $h \trianglelefteq h_0$ where
$h_0 = [\![\mathsf{self} : C \vdash \mathit{constr}\,C : \mathit{con}]\!]\mu(h,\eta) \neq \bot$.

Proof. By well founded induction on C using the order -C in an argument 
similar to that for Lemma 7.21. See Appendix. □ 

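Lemma 11.6 (soundness for commands) If $\Gamma \triangleright S$ then $\Gamma \vdash S$ is confined.

Proof. Let $C = (\Gamma\,\mathsf{self})$. Now we go by induction on $\Gamma \triangleright S$ and by cases
on $C$. Assume $\mathit{conf}\ C\ (h,\eta)$ and $\mathit{conf}\ \mu$ and $[\![\Gamma \vdash S]\!]\mu(h,\eta) \neq \bot$. Let $(h_0,\eta_0) =
[\![\Gamma \vdash S]\!]\mu(h,\eta)$. In each case we must show $h_0$ is confined and $\mathit{conf}\ C\ (h_0,\eta_0)$.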

Case $\Gamma \triangleright e_1.f := e_2$. Here $\eta_0 = \eta$ and $h_0 = [h \mid \ell \mapsto [h\ell \mid f \mapsto d]]$. Because
$\Gamma \triangleright e_1 : C$ and $\Gamma \triangleright e_2 : U$, by Lemma 11.4, $e_1$ and $e_2$ are confined at $C$. We must
first show that $h_0$ is confined and then show $\mathit{conf}\ C\ (h_0,\eta_0)$. By $\mathit{conf}\ C\ (h,\eta)$ we
know there is a confining partition $h = Ch * Oh_1 * Rh_1 * \ldots * Oh_k * Rh_k$. We partition $h_0$ using the given
partition for $h$. That is, the domain for each block, say $Ch^0$, is the same as the
corresponding block for $h$, say $Ch$. We claim this partition is confining for $h_0$. It
then follows by Def. 6.3 that $h \trianglelefteq h_0$. Then by Lemma 6.13, we get $\mathit{conf}\ C\ (h_0,\eta)$,
hence $\mathit{conf}\ C\ (h_0,\eta_0)$. It remains to show the claim, for which we need to show the
conditions in Def. 6.2. We go by cases on $C$.

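- $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: Only condition (1) in Def. 6.2 can possibly be violated.
By $\mathit{conf}\ C\ (h,\eta)$ we obtain $\mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$. Because $\mathit{loctype}\,\ell \le C$ we
have $\ell \in \mathit{dom}(Ch^0)$. By confinement of $e_2$, $d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$. Hence $Ch^0 \not\leadsto Rh^0_j$
for all $j$.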

- $C \le \mathit{Own}$: Let $\eta\,\mathsf{self} \in \mathit{dom}(Oh^0_i)$ for some $i$. Only conditions (2) and (3) in
Def. 6.2 can possibly be violated. Because $\mathit{loctype}\,\ell \le C$, $\ell \in \mathit{dom}(Oh^0_j)$ for
some $j$. Because $e_2$ is confined at $C$ we have $d \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Rh^0_i)$.
We consider the case $C = \mathit{Own}$ and $e_1 = \mathsf{self}$. Then $\ell = \eta\,\mathsf{self}$ and $i = j$,
establishing condition (2). By typing, $f \in \bar{g}$. Hence $Oh^0_i \not\leadsto^{\bar{g}} Rh^0_i$, establishing
condition (3). In the case $e_1 \neq \mathsf{self}$, by the analysis we have $U \not\le \mathit{Rep}$, thus
establishing conditions (2) and (3).

Now we consider the case $C < \mathit{Own}$. By the analysis we have $U \not\le \mathit{Rep}$, thus
establishing conditions (2) and (3).
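- $C \le \mathit{Rep}$: Let $\eta\,\mathsf{self} \in \mathit{dom}(Rh^0_i)$ for some $i$. Only condition (4) in Def. 6.2 can
possibly be violated. Because $\mathit{loctype}\,\ell \le C$, $\ell \in \mathit{locs}(\mathit{Rep}{\downarrow})$. By confinement
of $e_1$ at $C$, we have $\ell \in \mathit{dom}(Rh^0_i)$. And, by confinement of $e_2$ at $C$, we have
$d \in \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Oh^0_i * Rh^0_i)$. This establishes condition (4).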

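Case $\Gamma \triangleright x := e.m(\bar{e})$. Here $h_0 = h_1$ and $\eta_0 = [\eta \mid x \mapsto d_1]$. Because $\Gamma \triangleright e : D$
and $\Gamma \triangleright \bar{e} : \bar{U}$, by Lemma 11.4, $e$ and $\bar{e}$ are confined at $C$. By the analysis and
Lemma 11.3 we have $\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h,\eta_1)$. Then by assumption $\mathit{conf}\ \mu$ we get
$\mathit{conf}\ (\mathit{loctype}\,\ell)\ (h_0,\eta_1)$. Hence $h_0$ is confined. To show $\mathit{conf}\ C\ (h_0,\eta_0)$, we go by
cases on $C$:

- $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: By $\mathit{conf}\ \mu$, $d_1 \notin \mathit{locs}(\mathit{Rep}{\downarrow})$. Hence $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$
by $\mathit{conf}\ C\ (h,\eta)$.

- $C \le \mathit{Own}$: Let $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$ for some $j$ in the confining partition of $h$. By
$\mathit{conf}\ \mu$, $d_1 \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ and $h \trianglelefteq h_0$. Because $x \neq \mathsf{self}$, we have $\eta_0\,\mathsf{self} = \eta\,\mathsf{self}$.
Hence $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$ by $\mathit{conf}\ C\ (h,\eta)$.
As $h \trianglelefteq h_0$, $\mathit{dom}(Rh_j) \subseteq \mathit{dom}(Rh^0_j)$. That is, $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh^0_j)$.

- $C \le \mathit{Rep}$: Because $\mathit{loctype}\,\ell \le C$, let $\ell \in \mathit{dom}(Rh_j)$ for some $j$ in the confining
partition of $h$. By $\mathit{conf}\ \mu$, $d_1 \in \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \Rightarrow d_1 \in \mathit{dom}(Oh^0_j * Rh^0_j)$ and
$h \trianglelefteq h_0$. Hence $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh^0_j * Rh^0_j)$ by $\mathit{conf}\ C\ (h,\eta)$
and Def. 6.3.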

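Case $\Gamma \triangleright x := \mathbf{new}\ B$. First, we claim $\mathit{conf}\ B\ (h_1,\eta_1)$ and $h \trianglelefteq h_1$. Then by
Lemma 11.5 we get $\mathit{conf}\ B\ (h_0,\eta_1)$ and $h_1 \trianglelefteq h_0$. So $h \trianglelefteq h_0$ and by Lemma 6.13
$\mathit{conf}\ C\ (h_0,\eta)$. To conclude, we argue that $\mathit{conf}\ C\ (h_0,[\eta \mid x \mapsto \ell])$ by cases on $C$.

- $C \not\le \mathit{Own} \land C \not\le \mathit{Rep}$: then $B \not\le \mathit{Rep}$ so $\ell \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ by typing and hence
$\mathit{conf}\ C\ (h_0,[\eta \mid x \mapsto \ell])$.

- $C \le \mathit{Own}$: Let $h = Ch * (Oh_1 * Rh_1) * \ldots * (Oh_k * Rh_k)$ be a confining partition of $h$,
and $j$ such that $\eta\,\mathsf{self}$ is in $\mathit{dom}(Oh_j)$. If $B \not\le \mathit{Rep}$ then $\mathit{conf}\ C\ (h_0,[\eta \mid x \mapsto \ell])$ by
definition. If $B \le \mathit{Rep}$ then we must show $\ell \in \mathit{dom}(Rh^0_j)$ where $h_0$ has confining
extension $h_0 = Ch^0 * (Oh^0_1 * Rh^0_1) \ldots$. This is defined just as in the proof of
Lemma 6.16, and we choose to put $\ell$ and the objects it constructs in $Rh_j$ to
obtain $Rh^0_j$.

- $C \le \mathit{Rep}$: By the static analysis, $B \not\le \mathit{Own}$. So $Oh^0_j = Oh_j$. Thus we need
$\mathit{rng}\,[\eta \mid x \mapsto \ell] \cap \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh^0_j)$. We choose to put $\ell$ and the objects it
constructs in $Rh_j$ to obtain $Rh^0_j$, which makes the inclusion hold.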

It remains to prove the claims $\mathit{conf}\ B\ (h_1,\eta_1)$ and $h \trianglelefteq h_1$. In the semantic
definition, $h_1 = [h \mid \ell \mapsto [\mathit{fields}\,B \mapsto \mathit{defaults}]]$ where $\ell = \mathit{fresh}(B,h)$. Define
$Bh = [\ell \mapsto [\mathit{fields}\,B \mapsto \mathit{defaults}]]$ so $h_1 = h * Bh$. Let $\eta_1 = [\mathsf{self} \mapsto \ell]$. Next,
we argue that $h \trianglelefteq h_1$ and $\mathit{conf}\ B\ (h_1,\eta_1)$. Because $h$ is closed, $\ell$ is not in the range
of any object state in $h$. To construct an extending partition it suffices to deal with
the new object, as its addition cannot violate confinement of existing objects. We
define the extension and argue by cases on $B$.

- $B \not\le \mathit{Own} \land B \not\le \mathit{Rep}$. For a confining partition of $h_1$ we extend that for $h$
by defining $Ch^1 = Ch * Bh$ and using the given partition of owner islands.
Because $\mathit{defaults}$ contains no locations, this is a confining partition and we have
$\mathit{conf}\ B\ (h_1,\eta_1)$.

- $B \le \mathit{Own}$. We extend the partition by adding an island $Oh^1_{k+1} * Rh^1_{k+1}$ with
$Oh^1_{k+1} = Bh$ and $Rh^1_{k+1} = \varnothing$. This is a confining partition because $\mathit{defaults}$ has
no locations and we have $\mathit{conf}\ B\ (h_1,\eta_1)$ because $\mathit{rng}\,\eta_1$ has no reps.

- $B \le \mathit{Rep}$. Then, by the analysis we have $C \le \mathit{Own}$ or $C \le \mathit{Rep}$; moreover as
$x \neq \mathsf{self}$, we have $\eta\,\mathsf{self} \neq \ell$, so $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j * Rh_j)$ for some $j$. Then we can
obtain a confining extension by adding $Bh$ to $Rh_j$, as $\mathit{defaults}$ has no locations.
As $\mathit{rng}\,\eta_1 = \{\ell\}$, we have $\mathit{conf}\ B\ (h_1,\eta_1)$ by definition.

This concludes the argument for $h \trianglelefteq h_1$ and $\mathit{conf}\ B\ (h_1,\eta_1)$.

The remaining cases are similar and can be found in the appendix. □ 

12. DISCUSSION AND RELATED WORK 

Programmers draw pictures of pointers in heap-based data structures and often
manage to get things right as far as the presence of pointers goes. For example, lists
don't get disconnected. The absence of pointers is harder to picture and many bugs 
are due to unexpected aliasing. Expectations are raised through use of encapsula- 
tion constructs such as private fields and modules, but heap structure is not entirely 
manifested in language constructs. Simulation relations are often used for reasoning 
about abstractions and here too aliasing presents a challenge: Multiple instances 
of an abstraction may reference a shared client object or be shared by multiple 
clients — but client references to representation objects can violate encapsulation. 
Various notions of ownership confinement have been proposed for encapsulation of 
objects. We have formalized one and shown that clients are independent from con- 
fined representations. Independence is formalized by an abstraction theorem that 
licenses reasoning about equivalence of class implementations using simulation re- 
lations. Confinement is formalized by drawing boundaries that signify the absence 
of pointers. 

12.1 Related work 

Representation independence. The main proof technique for representation in- 
dependence is so fundamental that it has appeared in many places, with a vari- 
ety of names, e.g., simulation, logical relations, abstraction mappings, relational 
parametricity (e.g., [Plotkin 1973; Reynolds 1984; Lynch and Vaandrager 1995; 
de Roever and Engelhardt 1998]). Among the many uses of simulations are pro- 
gram transformations and justification of logics for reasoning about data abstraction 
and modification of encapsulated state. 



Representation independence results are known for general transition systems [Mil- 
ner 1971; Lynch and Vaandrager 1995], first order imperative languages [He et al. 
1986; de Roever and Engelhardt 1998], higher order functional [Reynolds 1984; 
Mitchell 1986; 1991; 1996; Power and Robinson 2000] and higher order imperative 
languages [O'Hearn and Tennent 1995; Naumann 2002], and sequential object- 
oriented programs without heap allocation ([Cavalcanti and Naumann 2002] treats 
a language with class-based visibility and [Reddy 1998] treats one with instance- 
based visibility). As far as we know, our results are the first for shared references 
to mutable state, a ubiquitous feature in object-oriented and imperative programs. 
(The lacuna is mentioned in [Grossman et al. 2000].)

A widely held view seems to be that classical techniques based on denotational 
semantics and logical relations are inadequate in the face of the complex language 
features of interest. The combination of local state with higher order procedures 
makes it difficult to prove representation independence even for Algol, where pro- 
cedures can be passed as arguments but not assigned to state variables [O'Hearn 
and Tennent 1995]. Objects exhibit similar features. 
Difficulties with denotational semantics led to considerable advances using
small-step operational semantics [Gordon and Pitts 1998]. However, to get an adequate
induction hypothesis for an abstraction theorem, parametricity needs to be imposed
on the latent effects of procedure abstractions, either as a property to be proved or 
as an intrinsic feature of the semantic model [Reynolds 1981b; O'Hearn and Tennent 
1995]. These conditions are most easily expressed in terms of a denotational model, 
but if procedures can be stored in the heap on which they act, difficult domain 
equations must be solved. Recursive data types also lead to nontrivial domain 
equations. Even if solutions can be found, they may be quite complex structures 
that are difficult to understand and work with. 

One of the most relevant works using operational semantics is that of Grossman 
et al. [2000] where representation independence is approached using a dynamic no- 
tion of ownership by principals as in the security literature. To prove that clients 
are independent from the representation of an abstraction provided by a host program,
a wrapper construct is used to tag code fragments with their owner (e.g.,
client or "host"), and to provide an opaque type for the client's view of the abstraction.
This is a promising approach, but the results so far only show "independence
of evaluation", which is analogous to the special case of simulation used for
noninterference in analysis of information flow [Volpano et al. 1996; Abadi et al. 1999].
Although Grossman et al. [2000] offer their work as a simpler alternative to domain 
theoretic semantics, the technical treatment is somewhat intricate by the time the 
language is extended to include references, recursive and polymorphic types. 

Except for parametric polymorphism, we treat all these features, as well as others 
such as subclassing, dynamic binding, type tests and casts. Although Java syntax 
seems less elegant than, say, lambda calculus, it has several features that ease 
the difficulties. Owing to name-based type equivalence and subtyping, and the 
binding of methods to objects via their class, we can use a denotational model with
quite simple domains and fixpoint definitions in the manner of Strachey [2000] (cf. 
Sect. 3.1). 

For applications in security and automated static checking, it is important to de- 
vise robust, comprehensible models that support not only the idealized languages of 
research studies but also the full languages used in practice. Denotational seman- 
tics has conceptual advantages, at least if the domains are simple enough to have 
a clear operational significance. However, we admit that our enthusiasm for the 
efficacy of denotational techniques has been tempered by the irritation of flushing 
out bugs in intricate definitions and induction hypotheses. 

Our abstraction theorem and identity extension lemma can be used directly to 
prove equivalence of programs, where a program is a command in the context of 
a class table and designated class C. It would be reasonable to use a notion of 
equivalence based on field visibility: states would be equated if they are equal after 
hiding all fields except those visible in C. But this would beg the question whether 
hiding imposes encapsulation that is not intrinsic to the language. In this paper we 
use the finer equivalence on programs: for commands to be equivalent they must
yield outcomes that are identical after garbage collection. Thus encapsulation is
formulated in terms of private fields and confined reps but the identity extension
lemma is expressed, in effect, in terms of local variable blocks (in the style of, e.g.,
He et al. [1986]).

^12 Recently Levy [2002] used functor categories to give a denotational model for a higher order
language with pointers, but the model does not capture relational parametricity and the language
has neither object-oriented features nor recursive types.

Besides the "client interface" provided by public methods and analogous to the 

interfaces studied in previous work on representation independence, a class also has 
a "protected" interface to its subclasses. The combination of protected and public 
interfaces is complicated, but a thorough treatment of representation independence 
for object-oriented programs must take it into account. For reasoning about the 
protected interface, work on behavioral subclassing has used simulations to connect 
a class with its subclass [Liskov and Wing 1994; Leavens and Dhara 2000] but 
a formal connection has not been made with the use of simulations to connect 
alternative representations. 

Confinement. Quite a few confinement disciplines have been proposed, by Hogg
[1991], Almeida [1997], Vitek and Bokowski [2001], Clarke et al. [2001], Müller and
Poetzsch-Heffter [2000], Boyland [2001], Lea [2000], Aldrich et al. [2002], and Clarke
[2001] (the latter has a more comprehensive recent survey). Most proposals have
significant shortcomings; they disallow important design patterns or are not efficiently
checkable. Although the aim is to achieve encapsulation and thereby support mod- 
ular reasoning in one form or another, few proposals have been formally justified 
in these terms — none in terms of representation independence. 

Several works justify a syntactic discipline by proving that it ensures a confinement
invariant [Müller and Poetzsch-Heffter 2000; Clarke 2001; Aldrich et al. 2002].
Others go further and show some form of modular reasoning principle, as we discuss 
in detail below. Existing justifications involve disparate techniques and objectives, 
so that it is quite hard to assess and compare confinement disciplines. One of our 
contributions is to show how standard semantic techniques can be used for such 
assessments. 

The fact that type names are semantically relevant lets us use them to formu- 
late in semantic terms a condition similar to the ownership confinement notions 
of Müller [2002], Clarke et al. [2001] and their predecessors [Hogg 1991; Almeida
1997]. Whereas several papers emphasize reachability via paths, our formulation 
of confinement emphasizes partitioning of heap objects and the one-step points-to 
relation. In this we were inspired by the work of Reynolds [2001] that shows the 
efficacy of reasoning about partition blocks that may have dangling pointers. 

Reasoning on the assumption of confinement is a separate concern from enforce- 
ment or checking of confinement. Semantic considerations led us to a flexible, 
syntax-directed static analysis, but other analysis techniques such as model check- 
ing or theorem proving for (an approximation of) the semantic confinement property 
could be interesting. 

It is interesting to note that we get a strong reasoning principle on the basis 
of ownership confinement alone, in a form that can be checked without program 
annotations. By contrast, other works use annotations and combine ownership with 
uniqueness and effects (e.g., read-only) [Clarke and Drossopoulou 2002; Aldrich 
et al. 2002; Müller 2002].

Confinement figures heavily in the verification logics of Müller and Poetzsch-Heffter
[2000] and in some work by the group of Nelson and Leino [Leino and Nelson
2002; Detlefs et al. 1998] where it is needed for sound reasoning about the "modifies
clause" framing the scope of effects. Subsequent to the present work, Clarke and
Drossopoulou [2002] state results on reasoning about effects, using a confinement
discipline imposed using code annotations for confinement and effects. These works 
are concerned with delimiting the scope of effects, which is an important aspect of 
modular reasoning, but they do not address representation independence. 

There has been much work on capturing encapsulation via visibility (lexical 
scope), using existential types and subsumption (see [Bruce et al. 1999; Bruce 2002; 
Pierce 2002] and references therein). None of these works addresses the problem of 
confinement; they are concerned with the complex typing issues for object oriented 
languages. 

It is interesting to note that one of the main difficulties in designing safe and 
flexible type systems is due to the desire to eliminate or minimize the use of type 
testing and casting which are seen as loopholes that subvert type-based encapsula- 
tion. Indeed, parametric polymorphism has been much pursued as a means to cope 
with generic patterns that, in current practice, are usually coded using subsump- 
tion, casts, and type Object (a recent reference is the textbook by Bruce [2002]). 
Although parametric polymorphism has obvious merit, our results show that casts 
and type tests are themselves relationally parametric. It is behavioral subclassing 
which is at risk in some uses of casts and tests. This does not contradict [Reynolds 
1984] because our language has a nominal type system [Pierce 2002]; it is the name 
of a type, not its set of values, that is involved with tests and casts. 
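The remark about nominal typing can be illustrated with a minimal Java sketch. The classes `Meters`, `Feet`, `Cell` and `NominalCastDemo` are hypothetical and serve only to show the generic pattern coded with type Object, a type test, and a cast: the two value classes have the same structure, yet tests and casts distinguish them by class name alone.

```java
/** Two structurally identical classes; only their names differ. */
final class Meters {
    final int value;
    Meters(int value) { this.value = value; }
}

final class Feet {
    final int value;
    Feet(int value) { this.value = value; }
}

/** A generic container coded with subsumption to Object, as in pre-generics Java. */
final class Cell {
    private Object contents;
    void put(Object o) { contents = o; }
    Object get() { return contents; }
}

final class NominalCastDemo {
    /** Type tests inspect the class name of the object, not its set of values. */
    static String describe(Object o) {
        if (o instanceof Meters) return "meters:" + ((Meters) o).value;
        if (o instanceof Feet) return "feet:" + ((Feet) o).value;
        return "unknown";
    }

    /** Returns true only if o is a non-null instance of Feet. */
    static boolean castToFeetSucceeds(Object o) {
        try {
            Feet f = (Feet) o;   // throws ClassCastException for a Meters object
            return f != null;
        } catch (ClassCastException ex) {
            return false;
        }
    }
}
```

A `Meters` object stored in a `Cell` is recovered by `describe` as meters, and a cast of it to `Feet` fails even though the two classes declare identical fields.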

Our aim is to deal with the rich languages currently in use, rather than to advance 
language design. It is challenging to formalize the syntax precisely yet perspicu- 
ously. Rather than devising our own idiosyncratic formalization, we adapted that 
of Igarashi et al. [2001]. The details differ, as our language includes imperative 
constructs and non-public scoping and their main concern is type soundness. 

12.2 Future challenges 

The language for which our results are given encompasses many important fea- 
tures of object oriented languages. Two major features are missing and will require 
substantial additional work: concurrency and parametric polymorphism. The in- 
teraction between parametric and subtyping polymorphism is non-trivial and there 
are a number of competing type systems. Some languages, e.g., C++, have parametric
polymorphism but with significant limitations; for Java, parametric types
are a late addition. We expect to extend our work to them in the future. 

Ownership confinement is appropriate for reasoning about many designs in prac- 
tice and we have shown through a series of examples that our notion is applicable 
to widely used designs such as the observer and factory patterns. Two important 
issues are beyond the reach of our work (and much of the previous work on con- 
finement). The first is multiple ownership. A canonical example is a collection 
class with iterators. The reps for the collection are nodes of a data structure. The 
collection object mediates additions and deletions. To allow enumeration of ele- 
ments of the collection it is common to use iterator objects which need access to 
the nodes of the data structure. Static analyses have been given that allow some 
form of multiple owners [Clarke 2001; Müller 2002; Aldrich et al. 2002]. Although
our formalization of islands can be extended easily to encompass multiple owners,
it is not as clear how to extend the notion of simulation in a useful way. Our result 
formalizes the notion that an owner instance provides an abstraction and this is 
easily expressed in terms of the class construct. The generalization can perhaps be 
expressed by grouping the related owners (e.g., the collection class and the iterator 
class) in a module, but this is left for future work. 
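The collection-with-iterator situation can be made concrete with a small Java sketch. The class `IntBag` and its private `Node` class are hypothetical names chosen for illustration; the point is only that the iterator object holds its own reference (`cursor`) into the node chain, so the nodes are reachable from two objects rather than from a single owner.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** A hypothetical collection whose reps are the nodes of a singly linked chain. */
final class IntBag implements Iterable<Integer> {
    /** Rep class: instances are meant to stay inside the collection's island. */
    private static final class Node {
        final int item;
        final Node next;
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    private Node head;   // the collection mediates all additions
    private int size;

    void add(int item) { head = new Node(item, head); size++; }

    int size() { return size; }

    /** The iterator is a second object that references the same rep nodes. */
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private Node cursor = head;

            @Override
            public boolean hasNext() { return cursor != null; }

            @Override
            public Integer next() {
                if (cursor == null) throw new NoSuchElementException();
                int item = cursor.item;
                cursor = cursor.next;
                return item;
            }
        };
    }
}
```

Clients see only `Integer` values, so no node escapes, but under single-owner confinement the iterator would itself have to count as an owner of the nodes or as part of the collection's representation.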

The other challenging issue for confinement is ownership transfer. Consider a 
queue that owns objects representing tasks to be performed. For load balancing, 
tasks may be moved from one queue to another. In this case a task is owned by 
just one queue at a time and in a given state the system is confined according 
to the definition in this paper. A sequential program for transferring ownership
from one queue to another might look as follows: q2.task := q1.task; q1.task := null. From a
confined initial state this need not lead to a confined final state: there could be other
references to task. But it does lead to a confined final state if q1.task is initially
the only existing reference to the task. Unique references have been extensively
studied so let us assume that a static analysis is given for uniqueness. Even with 
uniqueness, our theory fails to apply, for two reasons. The first reason is a small 
one: in the intermediate state two different owners reference the same task. This 
problem is well known and can be surmounted: It is easy to add to our language an 
atomic command with the effect of the above sequence [Minsky 1996] and to show, 
given uniqueness, that it is confined. For practical purposes one would use a static 
analysis to check that q1.task is a dead expression [Boyland 2001].
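As a rough illustration of the transfer sequence, the following Java sketch packages the two assignments in one method. The classes `Task` and `TaskQueue` and the single-slot queue are simplifying assumptions, and Java offers no atomic command of the kind mentioned above; the method merely ensures that sequential callers never observe the intermediate state in which both queues reference the task.

```java
/** Hypothetical rep class: a unit of work owned by exactly one queue at a time. */
final class Task {
    final String name;
    Task(String name) { this.name = name; }
}

/** Hypothetical owner class with a single task slot, for brevity. */
final class TaskQueue {
    private Task task;   // the only reference to the task, by construction below

    /** The queue creates its own task, so no client ever holds a reference to it. */
    void submit(String name) { task = new Task(name); }

    boolean isEmpty() { return task == null; }

    String taskName() { return task == null ? null : task.name; }

    /** Performs q2.task := q1.task; q1.task := null as one step for sequential callers. */
    static void transfer(TaskQueue from, TaskQueue to) {
        if (from == to) return;   // avoid dropping the task when both names alias one queue
        to.task = from.task;
        from.task = null;
    }
}
```

Because `submit` allocates the task internally and no method returns it, the reference in `from.task` is unique before the transfer, which is the situation in which the final state is again confined.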

The second reason our theory does not apply is a technical one. To show that a 
method call is confined, we need that the caller's environment is confined in the final 
heap assuming it was confined in the initial one. We get this by using a condition 
stronger than confinement: from a confined state, a command or method yields 
a final heap that extends the initial one in the sense of Def. 6.3. All commands 
of our language yield heaps extended in this sense so all method meanings have 
this property. (See the proof of Theorem 6.17.) But, by definition of extension, 
$h \trianglelefteq h_0$ says that reps that exist in $h$ have the same owners in $h_0$ as in $h$, disallowing
ownership transfer. 
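The effect of the extension condition can be sketched in Java with a deliberately simplified model. Here a heap is abstracted to a map from rep identifiers to owner identifiers; the class `ExtensionCheck`, the string identifiers, and this abstraction are assumptions made for illustration and do not reproduce Def. 6.3.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical abstraction of a confined heap: each rep is mapped to its owner. */
final class ExtensionCheck {
    /**
     * Returns true if every rep present in 'before' is still present in 'after'
     * with the same owner. Reps that exist only in 'after' are unconstrained.
     */
    static boolean extendsOwnership(Map<String, String> before, Map<String, String> after) {
        for (Map.Entry<String, String> e : before.entrySet()) {
            String ownerAfter = after.get(e.getKey());
            if (!e.getValue().equals(ownerAfter)) return false;
        }
        return true;
    }
}
```

In this model, allocating a new task for some queue yields an extension, whereas moving an existing task from one queue to another does not, which mirrors why ownership transfer falls outside the theory.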

For static analysis there are some more modest issues worthy of investigation.
The simple conditions of Def. 6.9 ensure suitable confinement of the class table
but they are unnecessarily strong. Methods inherited into rep classes are not risky
if they do not leak self; such "anonymous methods" can be statically checked as
shown by Vitek and Bokowski [2001] and Grothoff et al. [2001] in work on module-based
confinement.^13 The conditions of our static analysis may also admit useful
variations.

Having shown that simulation is sound one might proceed to study completeness. 
It is not the case that our confinement conditions are necessary in general for 
simulations to be preserved. A trivial simulation might depend on no confinement 
at all. Also, a rep could be leaked but not exploited by any client. One can see
confinement as a kind of simulation which happens to be a rectangular predicate: 
h relates to h' just if h and h' are confined, independent of each other. This
suggests folding the confinement condition into the simulation relation, an idea
which is currently under study by Uday Reddy and Hongseok Yang for a Pascal-like
language.^14 For practical reasoning the benefits of treating confinement separately
are clear: it accords with informal design practice, is amenable to static checking,
and ensures soundness for a straightforward and modular notion of coupling.

^13 In fact the cited work is concerned with pragmatic aspects of the analysis and does not formalize
a semantic property ensured by the analysis.

The more practical question is how to express basic couplings and prove the 
simulation property for owner methods. To formalize the couplings for the ob- 
server examples one needs a formalism for inductive predicates on recursive data 
structures; separation logic appears promising for this purpose [Reynolds 2002]. 

As we discussed in conjunction with Example 8.6, representation independence li- 
censes reasoning about equivalence of programs that are structurally similar [Baner- 
jee et al. 2001; Riecke 1993]. This is quite adequate for uses of simulations such as 
static analyses and relating alternative interpretations for primitives, such as the 
lazy and eager access control implementations for Java [Banerjee and Naumann 
2002a]. But for abstraction in program development, typically called data refine- 
ment, it is not uncommon to consider significantly different program structures and 
this calls for a full program logic in which something like the abstraction theorem 
appears as a proof rule. For first-order imperative languages, several proof systems
have been given for reasoning about two versions of an abstraction [de Roever
and Engelhardt 1998]. Typically, relations (especially "abstraction functions") are
used to derive from one version the specification of the other version, which is then 
proved correct in a program logic. Logics for imperative object-oriented languages 
are at an early stage of development [Abadi and Leino 1997; Cavalcanti and Naumann
1999; Poetzsch-Heffter and Müller 1999; Huisman and Jacobs 2000; Huisman
2002; Reynolds 2002].

APPENDIX 

A. ADDITIONAL PROOFS 
Proof of Lemma 6.12 

By cases on $C$ and $B$. It suffices to consider $C < B$ and to deal with confinement
of $\eta$ in $h$.

- $C \le \mathit{Rep}$. Then the hypothesis of the Lemma is falsified because
$\eta\,\mathsf{self} \in \mathit{locs}(\mathit{Rep}{\downarrow})$.

- $C \not\le \mathit{Own} \land C \not\le \mathit{Rep}$. Then $B \not\le \mathit{Own} \land B \not\le \mathit{Rep}$, so $\mathit{conf}\ C\ (h,\eta) \iff \mathit{conf}\ B\ (h,\eta)$
because both $C$ and $B$ are subject to condition (1) in Definition 6.4.

- $C < B \le \mathit{Own}$. Again, both $B$ and $C$ are subject to the same condition, here
(2) in Definition 6.4.



^14 Their aim is to explicate the semantic structure of languages involving heap storage. Their
approach should lead to a lucid account on par with parametricity models for other
languages [Reynolds 1984; 1981b; Reddy 1998]. They have defined a parametricity semantics for
a Pascal-like language [Reddy and Yang 2002] in which heap cells are tuples of pointers and
integers rather than objects with scoped fields. Several challenges remain to be addressed if this
approach is to provide a foundation for reasoning about instance-based abstractions in Java-like
languages using a practical confinement discipline. Examples include nominal types and class-based
visibility (which is not modelled by naive use of existential types).
- $C \le \mathit{Own} < B$. We have $\mathit{conf}\ B\ (h,\eta) \Rightarrow \mathit{conf}\ C\ (h,\eta)$ by implication between
the consequents of (1) and (2) in Definition 6.4. The converse holds owing to
hypothesis $\mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \varnothing$.

Proof of Lemma 6.13 

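By cases on $C$. In the case $C \not\le \mathit{Own} \land C \not\le \mathit{Rep}$, we have $\mathit{conf}\ C\ (h,\eta) \iff
\mathit{conf}\ C\ (h_0,\eta)$ because Definition 6.4(1) of $\mathit{conf}\ C$ is independent of the heap. For
the cases $C \le \mathit{Own}$ and $C \le \mathit{Rep}$, we show $\mathit{conf}\ C\ (h_0,\eta)$ using $h \trianglelefteq h_0$. First, by
definition of $\trianglelefteq$ we have $\mathit{conf}\ h_0$. To show that $\eta$ is confined in $h_0$ for $C$, suppose

$h = Ch * Oh_1 * Rh_1 * \ldots * Oh_k * Rh_k$

is a confining partition of $h$. Let $j$ be such that $\mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Own}{\downarrow},\mathit{Rep}{\downarrow}) \subseteq
\mathit{dom}(Oh_j * Rh_j)$. Suppose, by $h \trianglelefteq h_0$, that this partition is extended by confining
partition $h_0 = Ch^0 * Oh^0_1 * Rh^0_1 * \ldots$. In the case $C \le \mathit{Own}$, we have $\mathit{rng}\,\eta \cap
\mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j) \subseteq \mathit{dom}(Rh^0_j)$, using $\mathit{conf}\ C\ (h,\eta)$ and the definition of $\trianglelefteq$.
The case $C \le \mathit{Rep}$ is similar.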

Additional cases for Lemma 6.16 

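Case $\Gamma \vdash x := e$. Here the heap is unchanged: $h_0 = h$ and the result holds by
reflexivity of $\trianglelefteq$.

Case $\Gamma \vdash x := \mathsf{super}.m(\bar{e})$. The same argument as for method call $e.m$.

Case $\Gamma \vdash S_1; S_2$. Let $(h_1,\eta_1) = \llbracket \Gamma \vdash S_1 \rrbracket \mu (h,\eta)$. By induction on $S_1$ we have
$h \trianglelefteq h_1$. By confinement of $S_1$ we have $\mathit{conf}\ C\ (h_1,\eta_1)$. So we can use induction on
$S_2$ to obtain $h_1 \trianglelefteq h_0$ and then $h \trianglelefteq h_0$ by transitivity of $\trianglelefteq$.

Case $\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}$. By induction on $S_1$ and $S_2$, using confinement
of $S_1$ and $S_2$.

Case $\Gamma \vdash T\ x := e\ \mathsf{in}\ S$. Let $\eta_1 = [\eta \mid x \mapsto \llbracket \Gamma \vdash e : U \rrbracket (h,\eta)]$. By $\mathit{conf}\ C\ (h,\eta)$ and
confinement of $e$ we have $\mathit{conf}\ C\ (h,\eta_1)$. Then by induction on $S$, using confinement
of $S$, we get $h \trianglelefteq h_0$.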

Proof of Lemma 7.3 

By induction on depth. If $C \not\le \mathit{Own}$ then the equality is direct from
Definition 7.1(1). If $C \le \mathit{Own}$ then it is possible that $CT(\mathit{Own})$ declares $m$ but
$CT'(\mathit{Own})$ does not (or vice versa). But in that case, by Definition 7.1(3) we
have $\mathit{mtype}(m, C) = \mathit{mtype}'(m, C)$ so $m$ must be declared in a superclass, whence
$\mathit{depth}(m, C) = 1 + \mathit{depth}(m, \mathit{super}C) = 1 + \mathit{depth}'(m, \mathit{super}C) = \mathit{depth}'(m, C)$.

Proof of Lemma 7.19 

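Let $\Gamma_B = (\bar{x}:\bar{T}, \mathsf{self}:B)$ and $\Gamma_C = (\bar{x}:\bar{T}, \mathsf{self}:C)$. To show

\[ \mathcal{R}\ (B, \bar{x}, \bar{T} \to T)\ (\mathit{restr}(d, B))\ (\mathit{restr}(d', B)) \qquad (*) \]

consider $(h,\eta) \in \llbracket \mathit{Heap} \otimes \Gamma_B \rrbracket$ and $(h',\eta') \in \llbracket \mathit{Heap} \otimes \Gamma_B \rrbracket'$ such that $\mathit{conf}\ B\ (h,\eta)$,
$\mathit{conf}\ B\ (h',\eta')$, and $\mathcal{R}\ (\mathit{Heap} \otimes \Gamma_B)\ (h,\eta)\ (h',\eta')$. By definition of $\mathit{restr}$ we have
$\mathit{restr}(d,B)(h,\eta) = d(h,\eta)$ and $\mathit{restr}(d',B)(h',\eta') = d'(h',\eta')$. So for $(*)$ it remains
to show

\[ \mathcal{R}\ (\mathit{Heap} \otimes T)_\bot\ (d(h,\eta))\ (d'(h',\eta')) \qquad (\dagger) \]

By Lemma 5.7(1) we have $(h,\eta) \in \llbracket \mathit{Heap} \otimes \Gamma_C \rrbracket$ and $(h',\eta') \in \llbracket \mathit{Heap} \otimes \Gamma_C \rrbracket'$. By
hypothesis, $C$ is non-rep so $B$ is also non-rep. As $\bar{T}$ is non-rep, we have $\mathit{rng}\,\eta \cap
\mathit{locs}(\mathit{Rep}{\downarrow}) = \emptyset$ and $\mathit{rng}\,\eta' \cap \mathit{locs}(\mathit{Rep}'{\downarrow}) = \emptyset$. Thus Lemma 6.12 is applicable to
$\eta, \eta'$ and using hypothesis $B \le C$ we obtain $\mathit{conf}\ C\ (h,\eta)$ and $\mathit{conf}\ C\ (h',\eta')$. Thus
we have established the antecedents needed to use hypothesis $\mathcal{R}\ (C, \bar{x}, \bar{T} \to T)\ d\ d'$
to obtain $(\dagger)$.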

Proof of Lemma 7.22 

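Case $\Gamma \vdash x : T$. Then $\mathcal{R}\ T_\bot\ (\eta\,x)\ (\eta'\,x)$ by $\mathcal{R}\ \Gamma\ \eta\ \eta'$, so the result follows by
semantics of $x : T$.

Case $\Gamma \vdash \mathsf{null} : B$. Then semantics is $\mathit{nil}$ and $\mathcal{R}\ B_\bot\ \mathit{nil}\ \mathit{nil}$ by definition of $\mathcal{R}$.

Case $\Gamma \vdash \mathsf{it} : \mathsf{unit}$. Similar to $\mathsf{null}$, as are the cases $\mathsf{true}$ and $\mathsf{false}$.

Case $\Gamma \vdash e_1 = e_2 : \mathsf{bool}$. Then, using identifiers from the semantic definition as
usual, we consider cases on $d_1$. If $d_1 = \bot$ then $d_1' = \bot = d_1$ by induction on $e_1$
and definition of $\mathcal{R}\ T$. Hence, by semantics of $e_1 = e_2$, $\llbracket \Gamma \vdash e_1 = e_2 : \mathsf{bool} \rrbracket (h,\eta) =
\llbracket \Gamma \vdash e_1 = e_2 : \mathsf{bool} \rrbracket' (h',\eta')$ and thus

\[ \mathcal{R}\ \mathsf{bool}_\bot\ (\llbracket \Gamma \vdash e_1 = e_2 : \mathsf{bool} \rrbracket (h,\eta))\ (\llbracket \Gamma \vdash e_1 = e_2 : \mathsf{bool} \rrbracket' (h',\eta')) \qquad (*) \]

The argument is symmetric for $d_2 = \bot$.

If none of $d_1, d_1', d_2, d_2'$ are $\bot$ then, by induction on $e_1$ we have $\mathcal{R}\ (T_1)_\bot\ d_1\ d_1'$.
Thus, by Lemma 7.12, $d_1 = d_1'$. Similarly, $d_2 = d_2'$. Hence $d_1 = d_2$ iff $d_1' = d_2'$,
whence the result $(*)$ holds by semantics.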

Additional cases for Lemma 7.23 

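Case $\Gamma \vdash x := \mathsf{super}.m(\bar{e})$.

By $\mathcal{R}\ \Gamma\ \eta\ \eta'$ we have $\mathcal{R}\ C\ \ell\ \ell'$, hence $\ell = \ell'$ by Lemma 7.12. By $\mathit{conf}\ C\ (h,\eta)$ and
$\mathit{conf}\ C\ (h',\eta')$ we have $\ell \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ and $\ell \notin \mathit{locs}(\mathit{Rep}'{\downarrow})$. Let $\eta_1 = [\mathsf{self} \mapsto \ell, \bar{x} \mapsto
\bar{d}]$ and $\eta_1' = [\mathsf{self} \mapsto \ell', \bar{x} \mapsto \bar{d}']$. By confinement of $x := \mathsf{super}.m(\bar{e})$ (Definition 6.7)
we have confined arguments, i.e., $\mathit{conf}\ (\mathit{super}C)\ (h,\eta_1)$ and $\mathit{conf}\ (\mathit{super}C)\ (h',\eta_1')$.

By Lemma 7.22 for $\bar{e}$, and considering the non-$\bot$ case, we have $\mathcal{R}\ \bar{U}\ \bar{d}\ \bar{d}'$, whence,
by Lemma 7.13, $\mathcal{R}\ \bar{T}\ \bar{d}\ \bar{d}'$. From $\mathcal{R}\ C\ \ell\ \ell'$ we get $\mathcal{R}\ (\mathit{super}C)\ \ell\ \ell'$ by Lemma 7.13,
and thus $\mathcal{R}\ [\bar{x}:\bar{T}, \mathsf{self}:\mathit{super}C]\ \eta_1\ \eta_1'$. From $\mathcal{R}\ \mathit{MEnv}\ \mu\ \mu'$ we get

\[ \mathcal{R}\ (\mathit{super}C, \mathit{mtype}(m, \mathit{super}C))\ (\mu(\mathit{super}C)m)\ (\mu'(\mathit{super}C)m) \]

hence, as $h, h', \eta_1, \eta_1'$ are confined and related, $\mathcal{R}\ (\mathit{Heap} \otimes T)\ (h_1,d_1)\ (h_1',d_1')$ where
$(h_1,d_1) = \mu(\mathit{super}C)m(h,\eta_1)$ and $(h_1',d_1') = \mu'(\mathit{super}C)m(h',\eta_1')$. Thus $\mathcal{R}\ T\ d_1\ d_1'$
and $\mathcal{R}\ \mathit{Heap}\ h_1\ h_1'$. It remains to show that the updated stores $[\eta \mid x \mapsto d_1]$
and $[\eta' \mid x \mapsto d_1']$ are related. This follows from $\mathcal{R}\ T\ d_1\ d_1'$ and $T \le \Gamma\,x$ using
Lemma 7.13.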

Case $\Gamma \vdash S_1; S_2$.

As usual, we consider the non-$\bot$ case. By induction on $S_1$ we have $\mathcal{R}\ (\mathit{Heap} \otimes
\Gamma)_\bot\ (h_1,\eta_1)\ (h_1',\eta_1')$. Moreover, as $S_1$ is a constituent of a method in $CT$ and $CT'$,
by confinement of $S_1$ we have $\mathit{conf}\ C\ (h_1,\eta_1)$ and $\mathit{conf}\ C\ (h_1',\eta_1')$, so we can use
induction on $S_2$ to obtain the result.

Case $\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}$. Similar to the case of sequence, but also using
Lemma 7.22 for $e$.

Case $\Gamma \vdash T\ x := e\ \mathsf{in}\ S$.

By Lemma 7.22 for $e$ we have $\mathcal{R}\ U_\bot\ d\ d'$. If $d = \bot$, then $d' = \bot$ and both
semantics yield $\bot$. Otherwise, we have $\mathcal{R}\ T\ d\ d'$ by the corollary to Lemma 7.12.
Thus, from $\mathcal{R}\ \Gamma\ \eta\ \eta'$ we obtain $\mathcal{R}\ (\Gamma, x:T)\ \eta_1\ \eta_1'$ where $\eta_1 = [\eta \mid x \mapsto d]$ and
$\eta_1' = [\eta' \mid x \mapsto d']$ as in the semantic definition. In order to use induction on $S$, we
need to show $\mathit{conf}\ C\ (h,\eta_1)$ and $\mathit{conf}\ C\ (h',\eta_1')$. From condition (1) in Definition 6.9
of confinement for $CT$, $e$ is confined. In the case $C \not\le \mathit{Own}$, confinement of $e$ yields
$d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$, and thus $\mathit{conf}\ C\ (h,\eta_1)$. In the case $C \le \mathit{Own}$, confinement of
$e$ yields $d \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d \in \mathit{dom}(Rh_j)$ for some partition and $j$ with $\eta\,\mathsf{self} \in
\mathit{dom}(Oh_j)$. This is the condition required for $\mathit{conf}\ C\ (h,\eta_1)$ in this case. Similarly,
we get $\mathit{conf}\ C\ (h',\eta_1')$. Now, by induction on $S$ we get that both semantics are $\bot$
or else the result states from $S$ satisfy $\mathcal{R}\ (\mathit{Heap} \otimes \Gamma, x:T)\ (h_1,\eta_2)\ (h_1',\eta_2')$. In the
latter case, $\mathcal{R}\ (\mathit{Heap} \otimes \Gamma)\ (h_1, (\eta_2 \upharpoonright x))\ (h_1', (\eta_2' \upharpoonright x))$ as required.

Proof of Lemma 11.3(2) 

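Again the proof proceeds by cases on $C$. In each case we show $\mathit{conf}\ (\mathit{super}C)\ (h,\eta_1)$,
noting that $\ell = (\eta\,\mathsf{self})$.

— $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: By confinement of $\eta$ at $C$, we have $\ell \notin \mathit{locs}(\mathit{Rep}{\downarrow})$.
Because $\bar{e}$ is confined at $C$, we have $d_i \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ for all $d_i \in \bar{d}$. Thus
$\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \emptyset$. And, since $C \le \mathit{super}C$, we have $\mathit{conf}\ (\mathit{super}C)\ (h,\eta_1)$
by Lemma 6.12 and Definition 6.4(1).

— $C \le \mathit{Own}$: Choose a confining partition and let $j$ be such that $\ell = (\eta\,\mathsf{self}) \in
\mathit{dom}(Oh_j)$. Since $C \le \mathit{super}C$ we have $\mathit{super}C \le \mathit{Own}$ ($\mathit{Own} < \mathit{super}C$
is impossible by definition of $\mathit{super}C$). Because $\bar{e}$ is confined at $C$, we have
$d_i \in \mathit{locs}(\mathit{Rep}{\downarrow}) \Rightarrow d_i \in \mathit{dom}(Rh_j)$ for all $d_i \in \bar{d}$. Thus $\mathit{rng}\,\eta_1 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq
\mathit{dom}(Rh_j)$, proving $\mathit{conf}\ (\mathit{super}C)\ (h,\eta_1)$ by Definition 6.4(2).

— $C \le \mathit{Rep}$: Choose a confining partition and let $j$ be such that $\ell = (\eta\,\mathsf{self}) \in
\mathit{dom}(Rh_j)$. Since $C \le \mathit{super}C$ we have $\mathit{super}C \le \mathit{Rep}$ ($\mathit{Rep} < \mathit{super}C$ is
impossible by definition of $\mathit{super}C$). Because $\bar{e}$ is confined at $C$, we have $d_i \in
\mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \Rightarrow d_i \in \mathit{dom}(Oh_j * Rh_j)$ for all $d_i \in \bar{d}$. Thus $\mathit{rng}\,\eta_1 \cap
\mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh_j)$, proving $\mathit{conf}\ (\mathit{super}C)\ (h,\eta_1)$ by
Definition 6.4(3).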

Additional cases for Lemma 11.4 

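Case $\Gamma \triangleright x : \Gamma\,x$. Then $d = \eta\,x$. Confinement of $x$ follows because the conditions
for $d$ are exactly the same as the conditions for $\eta$ and $\eta$ is confined.

Cases $\Gamma \triangleright \mathsf{null} : B$, $\Gamma \triangleright \mathsf{true} : \mathsf{bool}$, $\Gamma \triangleright \mathsf{false} : \mathsf{bool}$, $\Gamma \triangleright \mathsf{it} : \mathsf{unit}$, $\Gamma \triangleright e\ \mathsf{is}\ B : \mathsf{bool}$.
For $\mathsf{null}$ the result holds since $\mathit{nil} \notin \mathit{Loc}$ and for $\mathsf{true}$, $\mathsf{false}$, $\mathsf{it}$, $e\ \mathsf{is}\ B$ the result
holds by Lemma 6.11.

Case $\Gamma \triangleright (B)e : B$. Then $d = \ell$ and the result follows by induction on $e$ for each
subcase of $C$.

Proof of Lemma 11.5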

First we show that $h_1$ is confined, where we have the following cases on $\mathit{super}C$:

— $\mathit{super}C = \mathsf{Object}$: then $h_1 = h$. So $\mathit{conf}\ h$ by hypothesis and $h \trianglelefteq h_1$ by
reflexivity of $\trianglelefteq$.

— $\mathit{super}C < \mathsf{Object}$: as $\mathit{super}C \sqsubset C$, we can appeal to induction for $\mathit{super}C$ to
obtain $\mathit{conf}\ h_1$ and $h \trianglelefteq h_1$. Now by Lemma 6.13 we have $\mathit{conf}\ C\ (h_1,\eta)$.

It remains to show $\mathit{conf}\ h_0$ and $h \trianglelefteq h_0$. This is a consequence of a more general
Claim: For the given $C$, suppose $\mathsf{self} : C \vdash S$ is a command with no method calls
and $\mathsf{self} : C \triangleright S$. Moreover, suppose that for any $\mathsf{new}\ B$ that occurs in $S$ we have
$B \sqsubset C$. Then $\mathsf{self} : C \vdash S$ is confined.

Applying the claim to $\mathit{constr}\,C$, we get $\mathit{conf}\ h_0$. Then Lemma 6.14 applies, to
yield $h_1 \trianglelefteq h_0$. So finally $h \trianglelefteq h_0$ by transitivity.

The proof of the claim is by structural induction on $S$. The argument is the same
as the proof of Lemma 11.6, except that in the case of $\mathsf{new}$ that proof appeals to
Lemma 11.5 whereas here we appeal to the induction hypothesis. This use of
induction is sound because for any $\mathsf{new}\ B$ in the constructor, $B \sqsubset C$ and hence
$B \ne C$.

Additional cases for Lemma 11.6 

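Case $\Gamma \triangleright x := e$. Here $h_0 = h$, hence confinement of $h_0$ follows because $\mathit{conf}\ C\ (h,\eta)$.
To show $\mathit{conf}\ C\ (h_0,\eta_0)$, we go by cases on $C$. First, as $\Gamma \triangleright e : T$, by Lemma 11.4
we have $e$ is confined. Choose a confining partition of $h$ and let $j$ be such that
$\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$. As $x \ne \mathsf{self}$ we have $\eta\,\mathsf{self} = \eta_0\,\mathsf{self}$.

— $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: We must show $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \emptyset$, which follows
because $d \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ and because $\mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \emptyset$ by $\mathit{conf}\ C\ (h,\eta)$.

— $C \le \mathit{Own}$: We must show $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$, which follows because
$\{d\} \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$ by confinement of $e$ at $C$ and because $\mathit{rng}\,\eta \cap
\mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$ by $\mathit{conf}\ C\ (h,\eta)$.

— $C \le \mathit{Rep}$: We must show $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh_j)$, which
follows because $\{d\} \cap \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh_j)$ by confinement of $e$
at $C$ and because $\mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh_j * Rh_j)$ by $\mathit{conf}\ C\ (h,\eta)$.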
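
The three proof obligations in the case analysis above can be read as a set-inclusion check on the range of an environment. The following is a minimal Python sketch of that reading, not part of the formal development. It assumes that locations are strings, that a confining partition is given as a list of islands (an owner block standing for dom(Oh_j) and a rep block standing for dom(Rh_j)), and that locs(Own↓) and locs(Rep↓) are passed in as plain sets. The names Island and conf_env, and the labels "client", "own" and "rep" for the three cases on C, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Island:
    """One island of a confining partition (illustrative encoding)."""
    owners: FrozenSet[str]  # stands for dom(Oh_j)
    reps: FrozenSet[str]    # stands for dom(Rh_j)


def conf_env(kind: str,
             env: Dict[str, Optional[str]],
             islands: List[Island],
             own_locs: Set[str],
             rep_locs: Set[str]) -> bool:
    """Check the condition on rng(env) used in the three cases on C.

    kind == "client": C is below neither Own nor Rep
    kind == "own":    C is a subclass of Own
    kind == "rep":    C is a subclass of Rep
    """
    # rng(env): the locations stored in the environment (None models nil).
    rng = {v for v in env.values() if v is not None}
    if kind == "client":
        # rng(env) must not meet locs(Rep)
        return rng.isdisjoint(rep_locs)
    self_loc = env.get("self")
    for isl in islands:
        if kind == "own" and self_loc in isl.owners:
            # reps reachable from the environment lie in self's own island
            return (rng & rep_locs) <= isl.reps
        if kind == "rep" and self_loc in isl.reps:
            # owners and reps in the environment lie in self's island
            return (rng & (own_locs | rep_locs)) <= (isl.owners | isl.reps)
    # self is not located in any island of the required kind
    return False
```

Under these assumptions, the step from η to η0 = [η | x ↦ d] in the case above amounts to adding one value to the range and re-running the same inclusion check, which is why confinement of e together with conf C (h, η) suffices.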

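Case $\Gamma \triangleright x := \mathsf{super}.m(\bar{e})$. Here $h_0 = h_1$ and $\eta_0 = [\eta \mid x \mapsto d_1]$. Because $\Gamma \triangleright \bar{e} : \bar{U}$,
by Lemma 11.4, $\bar{e}$ is confined at $C$. By Lemma 11.3 we have $\mathit{conf}\ (\mathit{super}C)\ (h,\eta_1)$.
Then by assumption $\mathit{conf}\ \mu$ we get $\mathit{conf}\ (\mathit{super}C)\ (h_0,\eta_1)$. Hence $h_0$ is confined. To
show $\mathit{conf}\ C\ (h_0,\eta_0)$, we go by cases on $C$. Recall that $\ell = \eta\,\mathsf{self}$, and, as $x \ne \mathsf{self}$,
$\ell = \eta_0\,\mathsf{self}$.

— $C \not\le \mathit{Rep} \land C \not\le \mathit{Own}$: As $C \le \mathit{super}C$ we have $\mathit{super}C \not\le \mathit{Rep} \land \mathit{super}C \not\le \mathit{Own}$.
By $\mathit{conf}\ \mu$, $d_1 \notin \mathit{locs}(\mathit{Rep}{\downarrow})$. Hence $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) = \emptyset$ by $\mathit{conf}\ C\ (h,\eta)$.

— $C \le \mathit{Own}$: Let $\eta\,\mathsf{self} \in \mathit{dom}(Oh_j)$ for some $j$ in the confining partition of $h$. As
$C \le \mathit{super}C$ we have $\mathit{super}C \le \mathit{Own}$ ($\mathit{Own} < \mathit{super}C$ is impossible by
definition of $\mathit{super}$). By $\mathit{conf}\ \mu$, $d_1 \notin \mathit{locs}(\mathit{Rep}{\downarrow})$ and $h \trianglelefteq h_0$. Hence $\mathit{rng}\,\eta_0 \cap
\mathit{locs}(\mathit{Rep}{\downarrow}) = \mathit{rng}\,\eta \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh_j)$ by $\mathit{conf}\ C\ (h,\eta)$. As $h \trianglelefteq h_0$,
$\mathit{dom}(Rh_j) \subseteq \mathit{dom}(Rh^0_j)$. That is, $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Rh^0_j)$.

— $C \le \mathit{Rep}$: Because $\mathit{loctype}\,\ell \le C$, let $\ell \in \mathit{dom}(Rh_j)$ for some $j$ in the confining
partition of $h$. As $C \le \mathit{super}C$ we have $\mathit{super}C \le \mathit{Rep}$ ($\mathit{Rep} < \mathit{super}C$
is impossible by definition of $\mathit{super}$). By $\mathit{conf}\ \mu$, $d_1 \in \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \Rightarrow d_1 \in
\mathit{dom}(Oh^0_j * Rh^0_j)$ and $h \trianglelefteq h_0$. Hence $\mathit{rng}\,\eta_0 \cap \mathit{locs}(\mathit{Own}{\downarrow}, \mathit{Rep}{\downarrow}) \subseteq \mathit{dom}(Oh^0_j *
Rh^0_j)$ by $\mathit{conf}\ C\ (h,\eta)$ and Definition 6.3.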

Case $\Gamma \triangleright S_1; S_2$. By induction on $S_1$, $h_1$ is confined and $\mathit{conf}\ C\ (h_1,\eta_1)$. Moreover,
if $S_1$ is a method call, it has confined argument values. Now by induction on
$S_2$, $h_2$ is confined and $\mathit{conf}\ C\ (h_2,\eta_2)$. And, if $S_2$ is a method call, it has confined
argument values. Hence all method calls in $S_1; S_2$ have confined argument values.

Case $\Gamma \triangleright \mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}$. By Lemma 6.11, $e$ is confined at $C$. If $e$ evaluates
to true, the result follows by induction on $S_1$, and if it evaluates to false, the result follows
by induction on $S_2$.

Case $\Gamma \triangleright T\ x := e\ \mathsf{in}\ S$. Because $\Gamma \triangleright e : U$ we have by Lemma 11.4 that $e$ is
confined at $C$. And, because $x \ne \mathsf{self}$ and $\mathit{conf}\ C\ (h,\eta)$, we get $\mathit{conf}\ C\ (h,\eta_1)$. Since
$\Gamma, x:T \triangleright S$, by induction on $S$ we have $\mathit{conf}\ C\ (h_1,\eta_2)$ and all method calls in $S$
have confined argument values. Hence $h_1$ is confined and $\mathit{conf}\ C\ (h_1, \eta_2 \upharpoonright x)$ and
all method calls in $T\ x := e\ \mathsf{in}\ S$ have confined argument values.